Preface 



We have endeavoured in this planned three volume work to present an 
exposition of the basic results, methods and applications of the theory of 
random processes. The various branches of the theory are, however, not 
treated in equal detail. 

This volume should be of value principally to mathematicians who are 
interested in studying the theory of random processes. We hope that 
researchers who apply the methods of the theory of random processes 
will also find the book interesting and useful. Prerequisites to the study 
of this book are basic courses in probability theory, measure theory and 
integration, and functional analysis. 

The first volume of "The Theory of Random Processes" is devoted to 
general problems of the theory of random functions and measure theory 
in function spaces. Some of the material presented in the authors' book 
"Introduction to the Theory of Random Processes" (Ergebnisse der 
Mathematik Band 72) is utilized here. Chapters III, IV, V and IX of the 
Introduction have been revised and now constitute the contents of 
Chapters I, III, IV and VI respectively. 

In volume II, the following topics are treated: the general theory of 
Markov processes, the theory of processes with independent increments, 
jump Markov processes, semi-Markov processes and branching processes. 

The third volume deals with the theory of martingales, stochastic 
integrals, stochastic differential equations, diffusion processes and limit 
theorems associated with stochastic differential equations. 



I. I. Gihman and A. V. Skorohod 
Chapter I 



Basic Notions of Probability Theory 



§1. Axioms and Definitions 

Events. The basic notions of probability theory are experiment, event 
and probability of events. 

A formal description of these notions is usually based on the set- 
theoretical model of probability theory developed by A. N. Kolmogorov 
in 1929. 

The experiments studied in probability theory (referred to as stoch- 
astic experiments) are carried out when a certain set of conditions Y is 
satisfied. This set of conditions does not uniquely determine the results 
of the experiment (also called the outcome or realization). This means 
that if the experiment is repeated (provided that the set of conditions Y 
is accurately satisfied) the results of the experiment will generally be 
different. 

When formalizing the notions of probability theory the first 
fundamental assumption is that the results of a collection of experiments 
under investigation in a given situation can be described by means of a 
certain set $\Omega$. Every meaningful event (occurring or not during the 
given experiment) corresponds to a certain subset $A$ of $\Omega$ in such a 
manner that the probabilistic operations on events correspond to 
set-theoretical operations on the corresponding subsets of $\Omega$. 

Moreover, the points $\omega \in \Omega$ correspond to atoms: every 
event is a sum of points, while a point $\omega$ cannot be represented as a 
sum of other events. For this reason the points belonging to $\Omega$ are 
called elementary events. 

In relation to $\Omega$, an experiment is completely characterized by the 
class of those events (subsets of $\Omega$) such that one can assert in each 
case whether it did or did not occur during the given experiment. These 
events are called observable (in the given experiment). 

Henceforth we shall adhere to this model of probability theory and 
identify events with the corresponding subsets of $\Omega$. The resulting dual 
terminology is presented below in a glossary translating set-theoretic 
notions into probabilistic notions. 
Set theory                                          Probability theory 

Space $\Omega$                                      Sure event 
$\omega$, a point of $\Omega$                       Elementary event 
$\emptyset$, the empty set                          Impossible event 
$A$, a subset of $\Omega$ ($A \subset \Omega$)      Event 
The set $A$ is contained in $B$ ($A \subset B$)     Event $A$ implies $B$ 
$C$, the sum (union) of sets $A$ and $B$            $C$, the sum (union) of events 
  ($C = A \cup B$)                                    $A$ and $B$ 
$C$, the intersection of sets $A$ and $B$           $C$, the intersection (or product) 
  ($C = A \cap B$)                                    of events $A$ and $B$ 
$\bar{A}$, the complement of set $A$                $\bar{A}$, the contrary event of $A$ 
$C$, the difference of two sets $A$ and $B$         $C$, the difference of events 
  ($C = A \setminus B$)                               $A$ and $B$ 
Sets $A$ and $B$ are without common points          Events $A$ and $B$ are disjoint 
  ($A \cap B = \emptyset$) 



We note that so far any arbitrary subset of $\Omega$ is called an event. 
However, from both a practical and a purely mathematical point of view it 
does not make sense to regard arbitrary subsets of $\Omega$ as events worthy 
of interest. Therefore one must select a suitable class of events from the 
subsets of $\Omega$. This class should be sufficiently wide and contain all 
the events which may arise during the solution of various practical 
problems. On the other hand, the size of this class is limited by the 
feasibility of effective utilization of mathematical techniques. Obviously, 
the problem of selecting the corresponding class of events should be solved 
individually in each case; however, we shall always assume subsequently 
that this class forms a $\sigma$-algebra of events. 

Definition 1. A class of events $\mathfrak{A}$ is called an algebra of 
events if it contains the sure event $\Omega$, the impossible event 
$\emptyset$ and, together with each pair of events $A$ and $B$ belonging to 
the class, their sum $A \cup B$ as well as the contrary event $\bar{A}$. 

The two events $\Omega$ and $\emptyset$ constitute the trivial algebra. 

The minimal algebra containing an event $A$ consists of four events: 
$\Omega$, $\emptyset$, $A$ and $\bar{A}$. 
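
The remark about the minimal algebra can be checked mechanically when the space is finite. The following is only an illustrative sketch: the function name `generated_algebra` and the six-point space are assumptions made for the example, not notation from the text. It closes a family of subsets under complement and pairwise union until nothing new appears.

```python
from itertools import combinations


def generated_algebra(omega, generators):
    """Smallest class of subsets of the finite set `omega` that contains
    the generators, the empty set and omega itself, and is closed under
    complementation and pairwise union (hypothetical helper)."""
    omega = frozenset(omega)
    events = {frozenset(), omega} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(events)
        # Close under complementation.
        for a in current:
            complement = omega - a
            if complement not in events:
                events.add(complement)
                changed = True
        # Close under pairwise union.
        for a, b in combinations(current, 2):
            union = a | b
            if union not in events:
                events.add(union)
                changed = True
    return events


# Illustrative space: outcomes of one die roll; A is "the result is even".
omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}
algebra = generated_algebra(omega, [A])
```

Here `algebra` consists of the empty set, the whole space, `A` and its complement, in agreement with the statement above. On a finite space every algebra is automatically closed under countable unions, so the same routine also produces the generated sigma-algebra in that case.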

Definition 2. An algebra of events which contains, along with each 
sequence of events, their sum is called a $\sigma$-algebra. 

It is clear that in the definitions and properties above we could have 
referred to algebras and $\sigma$-algebras of sets of a certain abstract 
space $\Omega$. 

Definition 3. The space $\Omega$ along with the $\sigma$-algebra of sets 
$\mathfrak{A}$ defined on it is called the measurable space 
$\{\Omega, \mathfrak{A}\}$, and the subsets of $\Omega$ belonging to 
$\mathfrak{A}$ are called $\mathfrak{A}$-measurable sets 
($\mathfrak{A}$-measurable events) or simply measurable sets (events) if no 
ambiguity arises concerning the $\sigma$-algebra under consideration. 



The $\sigma$-algebra of all the events under consideration in a given 
situation is usually denoted by the letter $\mathfrak{S}$. With respect to 
the measurable space $\{\Omega, \mathfrak{S}\}$ any given stochastic 
experiment is completely characterized by the class of events 
$\mathfrak{F}$ observed during this experiment. Clearly, this class is 
contained in $\mathfrak{S}$ and it is also evident that the class 
$\mathfrak{F}$ is closed with respect to the operations of addition, 
intersection and complementation. It is therefore natural to consider 
$\mathfrak{F}$ a $\sigma$-algebra of events. Therefore, formally a 
stochastic experiment is determined by a certain $\sigma$-algebra 
$\mathfrak{F}$ of $\mathfrak{S}$-measurable events. We call it the 
$\sigma$-algebra corresponding to the given experiment. 

Probability. Definition 4. A triple $\{\Omega, \mathfrak{S}, P\}$ 
consisting of a space of elementary events $\Omega$, a selected 
$\sigma$-algebra of events $\mathfrak{S}$ in $\Omega$, and a measure $P$ 
defined on $\mathfrak{S}$ such that $P(\Omega) = 1$ is called a probability 
space and the measure $P$ is called the probability. 

Probability spaces are the initial objects of probability theory. This, 
however, does not contradict the fact that when solving many specific 
problems the probability space is not given explicitly. 

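We present below several of the simplest well known properties of 
probability which easily follow from its definition ($S$ and $S_n$, 
$n = 1, 2, \ldots$, as given below all belong to $\mathfrak{S}$): 

a) $P(\emptyset) = 0$; 

b) $P\Bigl(\bigcup_{n=1}^{\infty} S_n\Bigr) \le \sum_{n=1}^{\infty} P(S_n)$; 

c) if $S_1 \subset S_2$, then $P(S_2 \setminus S_1) = P(S_2) - P(S_1)$; 

d) $P(\bar{S}) = 1 - P(S)$; 

e) if $S_n \subset S_{n+1}$, $n = 1, 2, \ldots$, then 
$P\Bigl(\bigcup_{n=1}^{\infty} S_n\Bigr) = \lim_{n \to \infty} P(S_n)$; 

f) if $S_n \supset S_{n+1}$, $n = 1, 2, \ldots$, then 
$P\Bigl(\bigcap_{n=1}^{\infty} S_n\Bigr) = \lim_{n \to \infty} P(S_n)$. 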
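
On a finite probability space the elementary properties of this list can be verified by direct computation. The sketch below is an illustration under assumptions of our own (a fair die, the events `S1`, `S2`, `S3`, and the helper `prob` are not from the text). It uses exact rational arithmetic so that the identities hold without rounding.

```python
from fractions import Fraction

# Illustrative probability space: a fair six-sided die.
omega = frozenset(range(1, 7))
weight = {w: Fraction(1, 6) for w in omega}


def prob(event):
    """P(event) as the sum of the weights of its elementary events."""
    return sum((weight[w] for w in event), Fraction(0))


S1 = frozenset({2})          # "the result is 2"
S2 = frozenset({2, 4, 6})    # "the result is even", so S1 is contained in S2
S3 = frozenset({1, 2, 3})    # "the result is at most 3"

p_empty = prob(frozenset())                    # probability of the impossible event
p_difference = prob(S2 - S1)                   # P(S2 \ S1)
p_complement = prob(omega - S2)                # probability of the contrary event
p_union, p_sum = prob(S2 | S3), prob(S2) + prob(S3)
```

The limit statements for monotone sequences of events are trivial on a finite space (such sequences are eventually constant), so they are not exercised here.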



Random variables. The concept of a random variable corresponds to the 
description of a stochastic experiment which measures a certain numerical 
quantity $\xi$. It is assumed that for any pair of numbers $a$ and $b$ 
($a < b$) the event $A(a, b)$ expressing that $\xi \in [a, b)$ is an 
observable event. 

The minimal $\sigma$-algebra $\mathfrak{F}_\xi$ containing all the events 
$A(a, b)$, $-\infty < a < b < \infty$, is the $\sigma$-algebra 
corresponding to this stochastic experiment. 

Let $A_x$ ($-\infty < x < \infty$) denote the event $\{\xi = x\}$. This 
event is measurable. Indeed, 

$$A_x = \bigcap_{n=1}^{\infty} A\Bigl(x - \frac{1}{n},\; x + \frac{1}{n}\Bigr).$$

Moreover, if $x_1 \ne x_2$, the events $A_{x_1}$ and $A_{x_2}$ are disjoint 
(this follows from the single-valuedness of the measurement results) and 
the union of all $A_x$, $-\infty < x < \infty$, is the set $\Omega$, since 
the measurement result is always represented by some real number. We 
now define a single-valued real function $f(\omega)$, $\omega \in \Omega$, 
by setting $f(\omega) = x$ if $\omega \in A_x$. It follows from the 
definition that $\xi = f(\omega)$ in each experiment and, moreover, that the 
set $\{\omega : a \le f(\omega) < b\} = A(a, b)$ is measurable. Recall that 
a real-valued function $f(\omega)$ defined on a measurable space 
$\{\Omega, \mathfrak{S}\}$ is called measurable ($\mathfrak{S}$-measurable) 
if for any two real numbers $a$ and $b$ the set 
$\{\omega : a \le f(\omega) < b\} \in \mathfrak{S}$. Therefore, a random 
variable $\xi$ can be identified with a certain measurable function on the 
probability space $\{\Omega, \mathfrak{S}, P\}$. 

Definition 5. An $\mathfrak{S}$-measurable real-valued function of 
elementary events $\omega$ is called a random variable $\xi$ (on a given 
probability space $\{\Omega, \mathfrak{S}, P\}$). 

Henceforth, we shall occasionally consider measurable functions on 
$\{\Omega, \mathfrak{S}, P\}$ which may possibly take on the values 
$\pm\infty$ also, or functions which are defined only on a measurable 
subset of $\Omega$. These functions are called generalized random 
variables. 

We note the following point connected with the definition of a 
random variable. It is commonly assumed that from the empirical 
point of view one cannot distinguish between events which differ on 
an event of probability zero. It would therefore be natural to identify 
two random variables $\xi$ and $\eta$ which are equal to each other with 
probability 1 and hence interpret a random variable as a class of 
measurable functions, in which each pair of functions may differ only on a 
set of probability 0. Such functions are called equivalent (or 
$P$-equivalent). This point of view is also justified by the fact that the 
majority of notions introduced here as well as the relationships obtained 
refer essentially to classes of equivalent functions. However, a consistent 
adherence to this point of view presents certain technical as well as basic 
problems. For this reason it would seem more convenient to regard random 
variables as individual functions and use special notation for their 
equivalence classes. 

Definition 6. Random variables $\xi$ and $\eta$ are called equivalent 
($P$-equivalent) if $P\{\xi \ne \eta\} = 0$. The $P$-equivalence of two 
random variables $\xi$ and $\eta$ is denoted by $\xi = \eta \pmod{P}$. 

Equivalent random variables are also referred to as satisfying 
$\xi = \eta$ almost surely (a.s.) or $\xi = \eta$ with probability 1. 

Analogous terminology and notation is also used in more general 
cases. We thus say that a certain function (or certain other objects) 
possesses property $H$ almost surely (for almost all $\omega$, or for all 
$\omega \pmod{P}$) if the set of $\omega$ for which this property is not 
satisfied is of probability 0. For example, if a sequence of random 
variables $\xi_n = f_n(\omega)$ converges to $\xi = f(\omega)$ for each 
$\omega$ except for a certain set $N$ and $P(N) = 0$, we say that $\xi_n$ 
converges to $\xi$ almost surely or that 

$$\xi = \lim \xi_n \pmod{P}.$$

We now present a number of basic properties of random variables 
which follow directly from the corresponding properties of arbitrary 
measurable functions. It is assumed that the random variables are 
defined on a fixed probability space $\{\Omega, \mathfrak{S}, P\}$. 

a) If $h(t_1, \ldots, t_n)$ is an arbitrary Borel function of $n$ real 
variables $t_1, \ldots, t_n$, and $\xi_1, \ldots, \xi_n$ are random 
variables, then $h(\xi_1, \xi_2, \ldots, \xi_n)$ is also a random variable. 

b) If $\{\xi_n;\ n = 1, 2, \ldots\}$ is a sequence of random variables, 
then $\sup \xi_n$, $\inf \xi_n$, $\overline{\lim}\,\xi_n$, 
$\underline{\lim}\,\xi_n$ are also random variables. 

Hence a very wide class of analytic operations commonly performed 
on functions transforms a random variable into a random variable 
independently of the specific form of the $\sigma$-algebra $\mathfrak{S}$. 
It is easy to see that these operations do not interfere with the 
equivalence relations between the random variables. More precisely: 

c) If $\xi_n$ and $\eta_n$ are equivalent ($n = 1, 2, \ldots$), and 
$h(t_1, \ldots, t_n)$ is a Borel function of $n$ real variables, then 
$h(\xi_1, \ldots, \xi_n)$ and $h(\eta_1, \ldots, \eta_n)$ are also 
equivalent. Moreover, the following pairs of random variables are 
equivalent as well: $\sup \xi_n$ and $\sup \eta_n$, $\inf \xi_n$ and 
$\inf \eta_n$, $\overline{\lim}\,\xi_n$ and $\overline{\lim}\,\eta_n$, 
$\underline{\lim}\,\xi_n$ and $\underline{\lim}\,\eta_n$. 

d) Let $\{\xi_n;\ n = 1, 2, \ldots\}$ be a sequence of random variables. 
The event $S = \{\lim \xi_n \text{ exists}\}$ is $\mathfrak{S}$-measurable. 
It is easy to verify that this event can be represented as: 

$$S = \bigcap_{k=1}^{\infty} \bigcup_{n=1}^{\infty} \bigcap_{m_1, m_2 > n} \Bigl\{ |\xi_{m_1} - \xi_{m_2}| < \frac{1}{k} \Bigr\}.$$

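Indicators of events serve as an important example of random variables. 
The indicator of an event $A$ is a random variable 
$\chi_A = \chi_A(\omega)$ defined as follows: 

$$\chi_A(\omega) = 1 \ \text{if}\ \omega \in A, \qquad \chi_A(\omega) = 0 \ \text{if}\ \omega \notin A.$$

If $A \in \mathfrak{S}$, then $\chi_A(\omega)$ is $\mathfrak{S}$-measurable. 

Note the correspondence between set-theoretical operations on 
events and the analogous algebraic operations on indicators: 

$$\chi_{\bigcup_{k=1}^{\infty} A_k}(\omega) = \sum_{k=1}^{\infty} \chi_{A_k}(\omega), \quad \text{if}\ A_k \cap A_r = \emptyset \ \text{for}\ k \ne r,$$

$$\chi_{A \setminus B}(\omega) = \chi_A(\omega) - \chi_B(\omega), \quad \text{if}\ B \subset A,$$

$$\chi_{\overline{\lim} A_n}(\omega) = \overline{\lim}\, \chi_{A_n}(\omega), \qquad \chi_{\underline{\lim} A_n}(\omega) = \underline{\lim}\, \chi_{A_n}(\omega).$$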

A random variable $\xi$ is called discrete if it admits only a finite or 
countable number of distinct values. Such a variable can be expressed 
as $\xi = \sum_k c_k \chi_{A_k}(\omega)$, where the $A_k$ are pairwise 
disjoint $\mathfrak{S}$-measurable sets and $\bigcup_k A_k = \Omega$. For 
each $\omega$ at most one summand is nonzero in the r.h.s. of the last 
equality and $\xi = c_k$ if $\omega \in A_k$. For an arbitrary random 
variable $\xi$ one can always construct a sequence of discrete random 
variables $\xi_n$ taking on only a finite number of possible values and 
converging to $\xi$ for each $\omega$. To prove this assertion it is 
sufficient to set 

$$\xi_n = \sum_{j=-n}^{n} \sum_{k=1}^{n} \Bigl(j + \frac{k-1}{n}\Bigr) \chi_{A_{jk}},$$

where 

$$A_{jk} = \Bigl\{\omega : j + \frac{k-1}{n} \le \xi < j + \frac{k}{n}\Bigr\}.$$

It then follows that $|\xi - \xi_n| < \frac{1}{n}$ if $|\xi| < n$. 

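It is easy to verify that for a non-negative $\xi$ one can construct a 
monotonically increasing sequence of non-negative discrete random variables 
(taking on a countable number of values) uniformly converging to $\xi$. 
Indeed, in this case we set 

$$\xi_n = \sum_{k=0}^{\infty} \frac{k}{2^n}\, \chi_{A_k^n}, \quad \text{where}\quad A_k^n = \Bigl\{\omega : \frac{k}{2^n} \le \xi < \frac{k+1}{2^n}\Bigr\}.$$

Then $0 \le \xi - \xi_n < 2^{-n}$ for all $\omega$. 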
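
The dyadic construction for a non-negative random variable amounts to rounding each observed value down to the nearest multiple of 2^(-n). A minimal sketch, in which the function name `dyadic_approx` and the sample values are assumptions chosen for illustration:

```python
import math


def dyadic_approx(x, n):
    """Value of the n-th discrete approximation at a point where the
    non-negative random variable equals x: it is k / 2**n on the set
    where k / 2**n <= x < (k + 1) / 2**n."""
    if x < 0:
        raise ValueError("the construction is stated for non-negative values")
    return math.floor(x * 2**n) / 2**n


# A few hypothetical observed values of the random variable.
samples = [0.0, 0.3, 1.0, math.pi, 17.625]

# For each sample, the sequence of approximations for n = 0, 1, ..., 19.
approximations = {x: [dyadic_approx(x, n) for n in range(20)] for x in samples}
```

For every sample the approximations are non-decreasing in n, and the error stays in the half-open interval from 0 to 2^(-n), independently of the sample. That uniform bound is what makes the convergence uniform in the elementary event.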

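Random elements. The notion of a random variable can be generalized 
to the notion of a random element with values in an arbitrary measurable 
space $\{\mathcal{X}, \mathfrak{B}\}$. Let $\{\Omega, \mathfrak{S}\}$ and 
$\{\mathcal{X}, \mathfrak{B}\}$ be two measurable spaces. The mapping 
$g : \omega \to x$ ($x \in \mathcal{X}$) is called a measurable mapping of 
$\{\Omega, \mathfrak{S}\}$ into $\{\mathcal{X}, \mathfrak{B}\}$ if 
$g^{-1}(B) = \{\omega : g(\omega) \in B\} \in \mathfrak{S}$ for an 
arbitrary $B \in \mathfrak{B}$. 

Definition 7. A random element $\xi$ with values in a measurable space 
$\{\mathcal{X}, \mathfrak{B}\}$ is a measurable mapping of 
$\{\Omega, \mathfrak{S}, P\}$ into $\{\mathcal{X}, \mathfrak{B}\}$. 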
If $\mathcal{X}$ is a metric space then $\mathfrak{B}$ is always assumed 
to be the $\sigma$-algebra of Borel sets (unless stipulated otherwise). If 
$\mathcal{X}$ is a vector space, then $\xi$ is called a random vector. 

Let a sequence of random elements $\xi_k$, $k = 1, 2, \ldots, n$, be 
given, defined on a fixed probability space $\{\Omega, \mathfrak{S}, P\}$ 
with values in the spaces $\{\mathcal{X}_k, \mathfrak{B}_k\}$ 
correspondingly. This sequence can be considered as a single random element 
$\zeta$, which will be called the direct product of the random elements 
$\xi_1, \ldots, \xi_n$, with values in a measurable space 
$\{\mathcal{X}, \mathfrak{B}\}$, where 
$\mathcal{X} = \prod_{k=1}^{n} \mathcal{X}_k$ is the product of the spaces 
$\mathcal{X}_k$ and $\mathfrak{B} = \prod_{k=1}^{n} \mathfrak{B}_k$ is the 
product of the $\sigma$-algebras $\mathfrak{B}_1, \ldots, \mathfrak{B}_n$. 

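The last remark is also valid in the more general case of an arbitrary 
set of random elements $\xi_\alpha$, $\alpha \in A$, with values in 
$\{\mathcal{X}_\alpha, \mathfrak{B}_\alpha\}$, where $A$ is a set of 
indices. Here the product $\mathcal{X} = \prod_{\alpha \in A} 
\mathcal{X}_\alpha$ represents the space of all the mappings 
$y = y(\alpha)$: $\alpha \to y(\alpha) \in \mathcal{X}_\alpha$, 
$\alpha \in A$, i.e. the space of all functions defined on $A$ admitting a 
value in $\mathcal{X}_\alpha$ for each $\alpha \in A$. 

A cylindrical set in $\mathcal{X}$ is the set $C$ of all $y(\alpha)$ 
satisfying relations of the type 

$$y(\alpha_k) \in B_{\alpha_k}, \quad B_{\alpha_k} \in \mathfrak{B}_{\alpha_k}, \quad k = 1, \ldots, n.$$

Here $n$ is an arbitrary integer and the $\alpha_k$ are arbitrary elements 
of $A$. More precisely, we call 
$C = C_{\alpha_1 \ldots \alpha_n}(B_{\alpha_1} \times \cdots \times B_{\alpha_n})$ 
a cylindrical set with the bases 
$B_{\alpha_1} \times \cdots \times B_{\alpha_n}$ over the coordinates 
$\alpha_1, \alpha_2, \ldots, \alpha_n$. The minimal $\sigma$-algebra 
containing all the cylindrical sets is denoted by $\mathfrak{B}$ and is 
called the product of the $\sigma$-algebras $\mathfrak{B}_\alpha$, 
$\mathfrak{B} = \prod_{\alpha \in A} \mathfrak{B}_\alpha$. We observe that 
the mapping $g : \omega \to y(\alpha)$ defined by the relations 
$g(\omega) = g(\omega, \alpha) = f_\alpha(\omega)$, where 
$\xi_\alpha = f_\alpha(\omega)$, is a measurable mapping of 
$\{\Omega, \mathfrak{S}\}$ into $\{\mathcal{X}, \mathfrak{B}\}$. 

If all the $\mathcal{X}_\alpha$ are the same, 
$\mathcal{X}_\alpha = \mathcal{X}_0$, then $\mathcal{X} = \mathcal{X}_0^A$ 
represents the space of all functions with values in $\mathcal{X}_0$ 
defined on $A$, and the mapping $g(\omega)$ associates a function from 
$\mathcal{X}_0^A$ with each elementary event $\omega$; in other words the 
mapping $g(\omega)$ is a random function. Thus, the family of random 
variables $\{\xi_\alpha,\ \alpha \in A\}$ may be regarded as a random 
function. 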

Let $\xi = f(\omega)$ be a random element with values in 
$\{\mathcal{X}, \mathfrak{B}\}$. 

Definition 8. The $\sigma$-algebra generated by a random element $\xi$ is 
the $\sigma$-algebra $\sigma_\xi$ or $\sigma(\xi)$ consisting of all sets 
of the form $\{f^{-1}(B);\ B \in \mathfrak{B}\}$. 

Clearly the class of sets $\{f^{-1}(B);\ B \in \mathfrak{B}\}$ is a 
$\sigma$-algebra. 

The following statement is an equivalent formulation of the above: 
the $\sigma$-algebra $\sigma_\xi$ is the minimal $\sigma$-algebra in 
$\Omega$ with respect to which the random element $\xi$ is measurable. 

It is intuitively clear that measurability of a certain random variable 
$\eta$ with respect to $\sigma_\xi$ means that $\eta$ is a function of 
$\xi$. 
Lemma 1. Let $\xi = f(\omega)$ be a random element on 
$\{\Omega, \mathfrak{S}, P\}$ with values in $\{\mathcal{X}, \mathfrak{B}\}$ 
and $\eta$ be a $\sigma_\xi$-measurable random variable. Then there exists 
a $\mathfrak{B}$-measurable real valued function $g(x)$ such that 
$\eta = g(\xi)$. 

Proof. Assume that $\eta$ is a discrete random variable admitting values 
$a_n$, $n = 1, 2, \ldots$. Let $A_n = \{\omega : \eta = a_n\}$. Then there 
exists $B_n \in \mathfrak{B}$ such that $f^{-1}(B_n) = A_n$. Put 
$B_n' = B_n \setminus \bigcup_{k=1}^{n-1} B_k$. The sets $B_n'$ are 
disjoint, 

$$f^{-1}(B_n') = A_n \setminus \bigcup_{k=1}^{n-1} A_k = A_n \quad\text{and}\quad f^{-1}\Bigl(\bigcup_{n=1}^{\infty} B_n'\Bigr) = \bigcup_{n=1}^{\infty} A_n = \Omega,$$

i.e. $f(\Omega) \subset \bigcup_{n=1}^{\infty} B_n'$. Now put $g(x) = a_n$ 
if $x \in B_n'$. Then $\eta = g(\xi)$. 

We now consider the general case. There exists a sequence of discrete 
$\sigma_\xi$-measurable random variables $\eta_n$ convergent to $\eta$ for 
each $\omega$. Therefore $\eta_n = g_n(\xi)$, where $g_n(x)$ is 
$\mathfrak{B}$-measurable. The set of points $S$ on which the functions 
$g_n(x)$ converge to a certain limit is $\mathfrak{B}$-measurable, it 
contains $f(\Omega)$ and $\lim g_n(x) = \lim \eta_n = \eta$ for 
$x = f(\omega) \in f(\Omega)$. Putting $g(x) = \lim g_n(x)$ for $x \in S$ 
and $g(x) = 0$ for $x \notin S$ we obtain $\eta = g(\xi)$. □ 



Mathematical expectation. The mathematical expectation of a random 
variable is its most important numerical characteristic. This notion 
corresponds to the intuitive notion of the value of the arithmetic mean 
of observations on a random variable in a long sequence of identical 
stochastic experiments. 

By definition the mathematical expectation of a random variable 
$\xi = f(\omega)$ is equal to the integral of $f(\omega)$ with respect to 
the measure $P$. We denote it as 

$$E\xi = \int_{\Omega} f(\omega)\, P(d\omega) = \int_{\Omega} \xi \, dP.$$

Often the designation $\Omega$ of the region of integration is omitted. 
Mathematical expectation possesses a number of properties which are well 
known from the theory of abstract integration. 

Convergence in probability. Various types of convergence of sequences 
of random variables play an important role in probability theory. The 
definition of convergence with probability 1 (almost surely) was presented 
earlier. 



Definition 9. If there exists a random variable $\xi$ such that for any 
$\varepsilon > 0$ 

$$P\{|\xi_n - \xi| > \varepsilon\} \to 0 \quad \text{as } n \to \infty,$$

we say that the sequence $\{\xi_n;\ n = 1, 2, \ldots\}$ converges in 
probability to the random variable $\xi$ and denote 
$\xi = P\text{-}\lim \xi_n$. 

In measure theory convergence in probability corresponds to con- 
vergence in measure. The following corollaries follow from the general 
results of measure theory: 

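a) If a sequence $\{\xi_n;\ n = 1, 2, \ldots\}$ converges almost surely 
it converges in probability. The converse is generally not true. However, a 
subsequence which converges almost surely can be selected from a sequence 
of random variables convergent in probability. 

b) A necessary and sufficient condition for convergence in probability 
of a sequence of random variables $\xi_n$ is as follows: for arbitrary 
$\varepsilon > 0$ and $\delta > 0$ an $n_0 = n_0(\varepsilon, \delta)$ can 
be found such that for $n$ and $n' > n_0$ 

$$P\{|\xi_n - \xi_{n'}| > \varepsilon\} < \delta.$$

This condition is called the condition of fundamentality in probability of 
the sequence $\{\xi_n;\ n = 1, 2, \ldots\}$. 

c) If $\xi = P\text{-}\lim \xi_n$ and $\eta = P\text{-}\lim \xi_n$, then 
$\xi = \eta \pmod{P}$. 

d) Let $\eta_k = P\text{-}\lim_{n} \xi_{kn}$ ($k = 1, 2, \ldots, m$) and 
let the function $\varphi(t_1, \ldots, t_m)$ be everywhere continuous in 
the $m$-dimensional Euclidean space $\mathcal{R}^m$ except possibly on a 
Borel set $D$ ($D \subset \mathcal{R}^m$) such that 
$P\{(\eta_1, \ldots, \eta_m) \in D\} = 0$. Then the sequence 
$\zeta_n = \varphi(\xi_{1n}, \ldots, \xi_{mn})$ converges in probability to 
$\zeta = \varphi(\eta_1, \ldots, \eta_m)$. In particular, if the sequences 
$\xi_{1n}$ and $\xi_{2n}$ are convergent in probability, so are the 
sequences $\xi_{1n} + \xi_{2n}$, $\xi_{1n}\xi_{2n}$ and 
$\xi_{1n}/\xi_{2n}$, the latter under the assumption that 
$P\{P\text{-}\lim \xi_{2n} = 0\} = 0$, and, moreover, 

$$P\text{-}\lim(\xi_{1n} + \xi_{2n}) = P\text{-}\lim \xi_{1n} + P\text{-}\lim \xi_{2n},$$

$$P\text{-}\lim(\xi_{1n}\,\xi_{2n}) = P\text{-}\lim \xi_{1n} \cdot P\text{-}\lim \xi_{2n}, \qquad P\text{-}\lim \frac{\xi_{1n}}{\xi_{2n}} = \frac{P\text{-}\lim \xi_{1n}}{P\text{-}\lim \xi_{2n}}.$$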
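
The statement that convergence in probability does not imply almost sure convergence, while an almost surely convergent subsequence can always be extracted, is often illustrated by a sequence of "sliding" indicators. The sketch below uses that classical example, which is an assumption brought in for illustration and does not appear in the text: the space is the interval [0, 1) with Lebesgue measure, and for n = 2^k + j (0 <= j < 2^k) the n-th variable is the indicator of [j/2^k, (j+1)/2^k). The helper names are hypothetical.

```python
from fractions import Fraction


def block(n):
    """Write n >= 1 as 2**k + j with 0 <= j < 2**k and return (k, j)."""
    k = n.bit_length() - 1
    return k, n - 2**k


def xi(n, omega):
    """Indicator of the interval [j / 2**k, (j + 1) / 2**k) at the point omega."""
    k, j = block(n)
    return 1 if Fraction(j, 2**k) <= omega < Fraction(j + 1, 2**k) else 0


def prob_exceeds(n):
    """P{|xi_n| > eps} for any 0 < eps < 1: the length of the n-th interval."""
    k, _ = block(n)
    return Fraction(1, 2**k)


omega = Fraction(1, 3)  # an arbitrary fixed elementary event

# Number of indices n in the k-th block [2**k, 2**(k+1)) with xi_n(omega) = 1.
hits_per_block = [sum(xi(n, omega) for n in range(2**k, 2**(k + 1)))
                  for k in range(8)]

# Values along the subsequence n = 2**k at the same omega.
subsequence = [xi(2**k, omega) for k in range(8)]
```

The probabilities `prob_exceeds(n)` tend to zero, so the sequence converges to 0 in probability. Every block nevertheless contains exactly one index at which the variable equals 1 at the chosen point, so the sequence converges at no point of [0, 1). Along the subsequence n = 2^k the values vanish from some index on for every positive point, which gives almost sure convergence of that subsequence.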

A sufficient condition for convergence with probability 1 as stated be- 
low is useful in various specific problems : 

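Lemma 2. If there exists a sequence $\varepsilon_n > 0$ such that 

$$\sum_{n=1}^{\infty} P\{|\xi_{n+1} - \xi_n| > \varepsilon_n\} < \infty, \qquad \sum_{n=1}^{\infty} \varepsilon_n < \infty,$$

then $\xi_n$ converges with probability 1 to a certain random variable 
$\xi$. If for any $\varepsilon > 0$ 

$$\sum_{n=1}^{\infty} P\{|\xi - \xi_n| > \varepsilon\} < \infty,$$

then $\xi_n$ converges to $\xi$ with probability 1. 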
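Proof. Let $A_n$ denote the event $\{|\xi_{n+1} - \xi_n| > \varepsilon_n\}$. 
Then 

$$P\Bigl(\bigcap_{m=1}^{\infty} \bigcup_{n=m}^{\infty} A_n\Bigr) \le \lim_{m \to \infty} \sum_{n=m}^{\infty} P(A_n) = 0.$$

Therefore, the terms of the series $\sum_{n=1}^{\infty} |\xi_{n+1} - \xi_n|$, 
starting with some index $m = m(\omega)$, are dominated with probability 1 
by the terms of the convergent series $\sum_{n=1}^{\infty} \varepsilon_n$. 
This proves the first assertion. Next, let 
$B_{N,n} = \{|\xi - \xi_n| > 1/N\}$. Then 

$$P\{\overline{\lim}\,|\xi - \xi_n| > 0\} = P\Bigl\{\bigcup_{N=1}^{\infty} \bigcap_{m=1}^{\infty} \bigcup_{n=m}^{\infty} B_{N,n}\Bigr\} \le \lim_{N \to \infty} \lim_{m \to \infty} \sum_{n=m}^{\infty} P(B_{N,n}) = 0,$$

which proves the second assertion. □ 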

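$\mathcal{L}_p$-spaces. By $\mathcal{L}_p = \mathcal{L}_p(\Omega, \mathfrak{S}, P)$ 
($p \ge 1$) we denote a linear normed space of random variables $\xi$ on 
$\{\Omega, \mathfrak{S}, P\}$ satisfying $E|\xi|^p < \infty$. The norm in 
$\mathcal{L}_p$ is defined by 

$$\|\xi\| = \bigl(E|\xi|^p\bigr)^{1/p}.$$

The convergence of the sequence $\xi_n$ to its limit $\xi$ in 
$\mathcal{L}_p$ (the $\mathcal{L}_p$-convergence) signifies that 

$$E|\xi - \xi_n|^p \to 0 \quad \text{as } n \to \infty.$$

The $\mathcal{L}_p$-convergence implies convergence in probability. This 
fact follows directly from Chebyshev's inequality 

$$P\{|\xi - \xi_n| > \varepsilon\} \le \frac{E|\xi - \xi_n|^p}{\varepsilon^p}.$$

The space $\mathcal{L}_p$ is complete. The most important 
$\mathcal{L}_p$-spaces are $\mathcal{L}_1$ and $\mathcal{L}_2$. We shall now 
discuss $\mathcal{L}_2$ in some detail. Note that all the definitions above 
and the theorems in this section are valid with no modifications for 
complex-valued random variables. 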

The space $\mathcal{L}_2 = \mathcal{L}_2(\Omega, \mathfrak{S}, P)$ of 
complex-valued random variables becomes a Hilbert space if we define in 
$\mathcal{L}_2$, for each pair of random variables $\zeta$ and $\eta$, their 
scalar product, putting it equal to $E\zeta\bar{\eta}$. 

Two random variables $\zeta$ and $\eta$ are called orthogonal if 
$E\zeta\bar{\eta} = 0$. In the case when $\zeta$ and $\eta$ are real and 
$E\zeta = E\eta = 0$, orthogonality is equivalent to the property that the 
variables are uncorrelated. Convergence of the sequence 
$\{\zeta_n;\ n = 1, 2, \ldots\}$ in $\mathcal{L}_2$ to a random variable 
$\zeta$ means that 

$$\|\zeta - \zeta_n\|^2 = E|\zeta - \zeta_n|^2 \to 0 \quad \text{as } n \to \infty.$$

This type of convergence is called the mean-square convergence and is 
designated as $\zeta = \text{l.i.m.}\ \zeta_n$. 

Note that the scalar product is a continuous function of its arguments. 
In certain cases it is convenient to express the condition of convergence 
in $\mathcal{L}_2$ in terms of the covariance of the family of random 
variables. 

Definition 10. The function 
$B(t_1, t_2) = E\zeta_{t_1}\bar{\zeta}_{t_2}$ ($t_1, t_2 \in T$) is called 
the covariance of a set of random variables $\zeta_t \in \mathcal{L}_2$, 
$t \in T$. 

In this definition $T$ designates an arbitrary set. 

Let a non-negative function $\psi(t)$ be defined on $T$ which takes on 
arbitrarily small values. 

The random variable $\eta$ ($\eta \in \mathcal{L}_2$) is called the limit 
of the family $\{\zeta_t,\ t \in T\}$ in $\mathcal{L}_2$ (m.s. limit) as 
$\psi(t) \to 0$ if for any $\varepsilon > 0$ a $\delta > 0$ can be found 
such that 

$$E|\eta - \zeta_t|^2 < \varepsilon$$

for all $t$ such that $0 < \psi(t) < \delta$. 

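Lemma 3. For the existence of the limit of a collection of random 
variables $\{\zeta_t,\ t \in T\}$ as $\psi(t) \to 0$, it is necessary and 
sufficient that the limit of the covariance 
$B(t, t') = E\zeta_t\bar{\zeta}_{t'}$ exist as 
$\psi(t) + \psi(t') \to 0$. If this condition is satisfied and 
$\eta = \text{l.i.m.}\ \zeta_t$, then 

$$E|\eta|^2 = \lim_{\psi(t) \to 0} B(t, t).$$

Proof. Necessity. The necessity follows from the continuity of the 
scalar product. 

Sufficiency. Let there exist $\lim B(t_1, t_2) = B_0$ as 
$\psi(t_1) + \psi(t_2) \to 0$. Note that $B_0$ is non-negative 
($B_0 = \lim B(t, t)$ as $\psi(t) \to 0$). Therefore 

$$E|\zeta_{t_1} - \zeta_{t_2}|^2 = B(t_1, t_1) - 2\,\mathrm{Re}\, B(t_1, t_2) + B(t_2, t_2) \to 0$$

for $\psi(t_1) + \psi(t_2) \to 0$. It follows from the completeness of 
$\mathcal{L}_2$ that there exists $\eta = \text{l.i.m.}\ \zeta_t$ for 
$\psi(t) \to 0$. Moreover, 

$$E\zeta_t\bar{\zeta}_{t'} \to E|\eta|^2 \quad \text{for } \psi(t) + \psi(t') \to 0,$$

i.e. $\|\eta\|^2 = E|\eta|^2 = \lim B(t, t)$ for $\psi(t) \to 0$; this 
completes the proof of the lemma. □ 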



The Hilbert space $\mathcal{L}_2^m = \mathcal{L}_2^m(\Omega, \mathfrak{S}, P)$ 
of random vectors with values in the $m$-dimensional complex space 
$\mathcal{Z}^m$ is defined in an analogous manner. This space consists of 
random vectors $\zeta$ with values in $\mathcal{Z}^m$ for which 
$E|\zeta|^2 < \infty$. Here the scalar product of two random vectors 
$\zeta$ and $\eta$ is defined as $E(\zeta, \eta)$, where $(x, y)$ is the 
scalar product in $\mathcal{Z}^m$, $|x|^2 = (x, x)$. Lemma 3 is also valid 
in the space $\mathcal{L}_2^m$ if we define $B(t, t')$ as 
$E(\zeta_t, \zeta_{t'})$. 

Distributions of random vectors. Let $\xi$ be a random element with 
values in the measurable space $\{\mathcal{X}, \mathfrak{B}\}$. The 
distribution of the random element $\xi$ is the measure $\mu$ induced in 
$\{\mathcal{X}, \mathfrak{B}\}$ by $\xi$, i.e. 

$$\mu(B) = P\{\xi \in B\}, \quad B \in \mathfrak{B}.$$

Values of any statistical characteristic of a random element $\xi$ may 
be determined by means of its distribution. Indeed, 

$$Ef(\xi) = \int_{\mathcal{X}} f(x)\, \mu(dx) \qquad (1)$$

for any $\mathfrak{B}$-measurable function $f(x)$ such that one of the 
sides of equality (1) is meaningful. Formula (1) is the rule of change of 
variables in abstract integrals. 

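Distributions in metric spaces are studied in Chapter V. In the present 
section we consider distributions in $\mathcal{R}^m$. Here $\mathfrak{B}^m$ 
denotes the $\sigma$-algebra of Borel sets in $\mathcal{R}^m$. 
Distributions in $\mathcal{R}^m$ are defined by means of distribution 
functions. Henceforth we shall write $a < b$ ($a \le b$), 
$a = (a^1, \ldots, a^m) \in \mathcal{R}^m$, 
$b = (b^1, \ldots, b^m) \in \mathcal{R}^m$, if $a^i < b^i$ 
($a^i \le b^i$) ($i = 1, \ldots, m$). The set $\{x : x < a\}$ will be 
denoted by $I_a$. The function 

$$F(x) = \mu(I_x) = P\{\xi < x\}$$

is called the distribution function of a random vector $\xi$ (or the 
distribution function of the measure $\mu$). 

Sets of the type $I[a, b) = \{x : a \le x < b\}$ are called intervals in 
$\mathcal{R}^m$. We now express the probability that a vector $\xi$ falls 
into an interval in terms of its distribution function. We introduce the 
notation 

$$\Delta^k_{[a,b)} G(x) = G(x^1, \ldots, x^{k-1}, b, x^{k+1}, \ldots, x^m) - G(x^1, \ldots, x^{k-1}, a, x^{k+1}, \ldots, x^m)$$

for any function $G(x)$, $x \in \mathcal{R}^m$. 

The quantity $\Delta^k_{[a^k, b^k)} F(x)$ is the probability of the event 
$\{\xi^j < x^j,\ j \ne k;\ a^k \le \xi^k < b^k\}$. 

It is easy to verify that 

$$P\{\xi \in I[a, b)\} = \mu(I[a, b)) = \Delta^1_{[a^1, b^1)} \Delta^2_{[a^2, b^2)} \cdots \Delta^m_{[a^m, b^m)} F(x). \qquad (2)$$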
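
Expanding the iterated differences gives a signed sum of the distribution function over the vertices of the interval: each vertex takes either the lower or the upper endpoint in every coordinate, and its sign is minus one raised to the number of lower endpoints used. The following sketch implements that expansion. The names `interval_probability` and `uniform_cdf` are assumptions made for illustration, and the uniform distribution on the unit cube is chosen only because its distribution function is explicit.

```python
from itertools import product


def interval_probability(F, a, b):
    """Mass of the interval I[a, b) obtained by applying the coordinate-wise
    differences of the distribution function F, expanded as a signed sum
    over the 2**m vertices of the interval."""
    m = len(a)
    total = 0.0
    for choice in product((0, 1), repeat=m):
        # choice[i] == 1 selects the upper endpoint b[i], 0 the lower a[i].
        vertex = [b[i] if c else a[i] for i, c in enumerate(choice)]
        sign = (-1) ** (m - sum(choice))
        total += sign * F(vertex)
    return total


def uniform_cdf(x):
    """Distribution function of a random vector uniform on the unit cube
    (illustrative choice; components are independent and uniform on [0, 1])."""
    result = 1.0
    for t in x:
        result *= min(max(t, 0.0), 1.0)
    return result


p2 = interval_probability(uniform_cdf, [0.25, 0.5], [0.75, 1.0])
p3 = interval_probability(uniform_cdf, [0.0, 0.0, 0.0], [0.5, 0.5, 0.5])
```

For the uniform distribution the result must equal the volume of the interval, which gives a quick consistency check: `p2` is the area 0.5 x 0.5 and `p3` is the volume 0.5 x 0.5 x 0.5.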

In addition to intervals $I[a, b)$ we shall also consider closed 
intervals $I[a, b] = \{x : a^j \le x^j \le b^j,\ j = 1, 2, \ldots, m\}$ as 
well as open intervals 
$I(a, b) = \{x : a^j < x^j < b^j,\ j = 1, 2, \ldots, m\}$. We note several 
properties of distribution functions: 

1) $0 \le F(x) \le 1$; 

2) if $x \le y$, then $F(x) \le F(y)$; 

3) $\mu[a, b) \ge 0$, where $\mu[a, b) = \mu(I[a, b))$ is determined by 
formula (2); 

4) $F(x - 0) = F(x)$; 

5) $F(x) \to 0$, provided that at least one of the coordinates of point 
$x$ approaches $-\infty$; 

6) $F(+\infty, +\infty, \ldots, +\infty) = 1$. 

Lemma 4. For an arbitrary function $F(x)$ in $\mathcal{R}^m$ satisfying 
conditions 1)-6) there exists a unique probability measure on 
$\mathfrak{B}^m$ whose distribution function coincides with $F(x)$. 

Proof. Consider the class $\mathfrak{M}$ of intervals $I[a, b)$ in 
$\mathcal{R}^m$. This class forms a semi-ring. Define on $\mathfrak{M}$ a 
set function $F(I[a, b))$ by the expression appearing in the r.h.s. of 
formula (2). The function $F(I[a, b))$ is an additive function on 
$\mathfrak{M}$. 

In order that $F$ be extended to a measure on $\mathfrak{B}^m$ it is 
necessary and sufficient that it satisfy the semi-additivity property, 
i.e. the inequality 

$$F(I[a_0, b_0)) \le \sum_{k=1}^{\infty} F(I[a_k, b_k)) \qquad (3)$$

should be satisfied for any system of semi-intervals $I[a_k, b_k)$ 
($k = 1, 2, \ldots$) such that 
$\bigcup_{k=1}^{\infty} I[a_k, b_k) \supset I[a_0, b_0)$. This extension on 
$\mathfrak{B}^m$ is unique. We now verify condition (3) in the case under 
consideration. 

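Since $F(x)$ is continuous from the left, for any $\eta > 0$, 
$\varepsilon_k > 0$ can be found such that 

$$0 \le F(I[a_k - \varepsilon_k, b_k)) - F(I[a_k, b_k)) < \frac{\eta}{2^k},$$

where $\varepsilon_k = (\varepsilon_k, \ldots, \varepsilon_k)$ 
($k = 1, 2, \ldots$). The open intervals $I(a_k - \varepsilon_k, b_k)$ 
cover the closed interval $I[a_0, b_0 - \varepsilon]$, $\varepsilon > 0$. 
In view of the Heine-Borel theorem, a finite subcover can be extracted, for 
example $\{I(a_k - \varepsilon_k, b_k)\}$, $k = 1, 2, \ldots, n$. Thus the 
sequence of intervals $\{I[a_k - \varepsilon_k, b_k)\}$, 
$k = 1, 2, \ldots, n$, covers the interval $I[a_0, b_0 - \varepsilon)$. The 
disjoint sets 

$$I[a_0, b_0 - \varepsilon) \cap \Bigl( I[a_k - \varepsilon_k, b_k) \setminus \bigcup_{j < k} I[a_j - \varepsilon_j, b_j) \Bigr), \quad k = 1, \ldots, n,$$

are sums of disjoint intervals $\Delta_j^k$ ($j = 1, 2, \ldots, m_k$). Thus 

$$I[a_0, b_0 - \varepsilon) = \bigcup_{k=1}^{n} \bigcup_{j=1}^{m_k} \Delta_j^k,$$

$$F(I[a_0, b_0 - \varepsilon)) = \sum_{k=1}^{n} \sum_{j=1}^{m_k} F(\Delta_j^k) \le \sum_{k=1}^{n} F(I[a_k - \varepsilon_k, b_k)) \le \sum_{k=1}^{\infty} F(I[a_k - \varepsilon_k, b_k)) \le \sum_{k=1}^{\infty} F(I[a_k, b_k)) + \eta.$$

Approaching the limit as $\varepsilon \to 0$, we obtain 

$$F(I[a_0, b_0)) \le \sum_{k=1}^{\infty} F(I[a_k, b_k)) + \eta.$$

Since $\eta > 0$ is arbitrary, inequality (3) and the lemma are proved. □ 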

Definition 11. A sequence of finite measures $\mu_n$ on $\mathfrak{B}^m$ 
is called weakly convergent to the measure $\mu$ (on $\mathfrak{B}^m$) if 
for an arbitrary bounded and continuous function $f(x)$ 

$$\int_{\mathcal{R}^m} f(x)\, \mu_n(dx) \to \int_{\mathcal{R}^m} f(x)\, \mu(dx). \qquad (4)$$

A family of measures is called weakly compact if a weakly convergent 
subsequence can be extracted from an arbitrary sequence. 

The following theorem is valid. 



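Theorem 1. In order that a sequence of measures $\mu_n$ on 
$\{\mathcal{R}^m, \mathfrak{B}^m\}$ be weakly compact it is necessary and 
sufficient that a) $\sup_n \mu_n(\mathcal{R}^m) < \infty$ and b) for any 
$\varepsilon > 0$ an interval $I[a, b)$ can be found such that 

$$\underline{\lim}_{n \to \infty} \bigl[\mu_n(I[a, b)) - \mu_n(\mathcal{R}^m)\bigr] > -\varepsilon. \qquad (5)$$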



The proof of this theorem is given in Section 1 of Chapter VI. 

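Characteristic functions. The function $J(u)$, 
$u = (u_1, u_2, \ldots, u_m)$, defined by 

$$J(u) = E e^{i(u, \xi)} = \int_{\mathcal{R}^m} e^{i(u, x)}\, \mu(dx)$$

is called the characteristic function of the random vector $\xi$ (or of the 
corresponding distribution $\mu$) in $\mathcal{R}^m$. 

The following properties of characteristic functions are obvious: 

1) $J(0) = 1$, $|J(u)| \le 1$; 

2) $J(u)$ is uniformly continuous; 

3) for any $n$, any complex numbers $z_j$ and any 
$u_j \in \mathcal{R}^m$ ($j = 1, \ldots, n$) 

$$\sum_{j,k=1}^{n} J(u_j - u_k)\, z_j \bar{z}_k \ge 0.$$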
Conversely, if a function possesses properties 1)-3), it is then the 
characteristic function of a certain distribution. The proof of this 
assertion is given in Section 2 of Chapter IV. 

One can define distributions on $\mathcal{R}^m$ by means of 
characteristic functions because the latter uniquely determine a 
distribution. For example, given a distribution with density $f(x)$, the 
characteristic function 

$$J(u) = \int_{\mathcal{R}^m} e^{i(u, x)} f(x)\, dx$$

is a Fourier transform of the function $f(x)$. Moreover, if $f(x)$ 
satisfies certain additional conditions (which are discussed in detail in 
the theory of Fourier integrals), then $f(x)$ can be recovered from $J(u)$ 
using the formula 

$$f(x) = \frac{1}{(2\pi)^m} \int_{\mathcal{R}^m} e^{-i(u, x)} J(u)\, du.$$



An analogous inversion formula can also be given for the distribution 
function F{x) in the general case. However, we now present a theorem 
concerning the uniqueness of determination of a distribution function 
from its characteristic function without using the inversion formula. 



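Theorem 2. If 

$$\int_{\mathcal{R}^m} e^{i(u, x)}\, \mu_1(dx) = \int_{\mathcal{R}^m} e^{i(u, x)}\, \mu_2(dx)$$

for all $u$, where $\mu_i$ are measures on $\mathfrak{B}^m$, then 
$\mu_1 = \mu_2$. 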



Proof. Denote by $K$ the class of complex-valued bounded Borel functions 
for which 

$$\int_{\mathcal{R}^m} f(x)\, \mu_1(dx) = \int_{\mathcal{R}^m} f(x)\, \mu_2(dx). \qquad (6)$$

We show that $K$ contains all the bounded Borel functions. Clearly, $K$ is 
a linear class. Since it contains the functions $e^{i(u, x)}$ it also 
contains all the possible linear combinations of these functions, 
$P(x) = \sum_k c_k e^{i(a_k, x)}$. Since $K$ is closed with respect to the 
operation of taking limits of sequences of uniformly bounded functions 
converging everywhere to a certain limit, and since in view of Weierstrass' 
theorem an arbitrary bounded continuous function $f(x)$ can be approximated 
by a uniformly bounded sequence $P_n(x)$ convergent to $f(x)$ for any $x$, 
it follows that $K$ contains all the bounded continuous functions. In view 
of $K$'s closure with respect to taking limits, it now follows that $K$ 
contains all the bounded Borel functions. Putting in (6) 
$f(x) = \chi_B(x)$, $B \in \mathfrak{B}^m$, we obtain that 
$\mu_1(B) = \mu_2(B)$. □ 

Next we establish the connection between the weak convergence of 
the distributions $\mu_n$ (to $\mu$) and the convergence of their 
characteristic functions. Let $J(u)$ and $J_n(u)$ denote the characteristic 
functions of the distributions $\mu$ and $\mu_n$. It follows by definition 
from the weak convergence of $\mu_n$ to $\mu$ that $J_n(u) \to J(u)$ for 
each $u$. 

A more profound result is contained in the following theorem: 

Theorem 3. If $J_n(u)$ are convergent for each $u$ to a certain function 
$\varphi(u)$ and $\varphi(u)$ is continuous at $u = 0$, then the 
distribution $\mu_n$ converges weakly to a certain distribution $\mu$ and 
$\varphi(u)$ is the characteristic function of the distribution $\mu$. 



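Proof. We first show that $\mu_n$ is a weakly compact sequence of 
measures. Let $A = (a, \ldots, a)$, $a > 0$. We have 

$$\frac{1}{(2a)^m} \int_{I[-A, A]} (1 - J_n(u))\, du = \frac{1}{(2a)^m} \int_{I[-A, A]} \int_{\mathcal{R}^m} \bigl(1 - e^{i(u, x)}\bigr)\, \mu_n(dx)\, du = \int_{\mathcal{R}^m} \Bigl(1 - \prod_{k=1}^{m} \frac{\sin a x_k}{a x_k}\Bigr) \mu_n(dx) \ge \frac{1}{2}\, \mu_n\bigl(\mathcal{R}^m \setminus I[-A_1, A_1]\bigr),$$

where $A_1$ denotes the vector 
$\bigl(\frac{2}{a}, \frac{2}{a}, \ldots, \frac{2}{a}\bigr)$. Approaching 
the limit in the inequality obtained as $n \to \infty$ and using Lebesgue's 
dominated convergence theorem we obtain 

$$\overline{\lim}_{n \to \infty}\, \mu_n\bigl(\mathcal{R}^m \setminus I[-A_1, A_1]\bigr) \le \frac{2}{(2a)^m} \int_{I[-A, A]} (1 - \varphi(u))\, du.$$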



In view of the continuity of $\varphi(u)$ at $u = 0$, the r.h.s. of the last inequality approaches zero as $a \to 0$. It thus follows from Theorem 1 that the sequence $\mu_n$ is weakly compact. Next we show that $\mu_n$ weakly converges to a certain limit. Indeed, there exists a subsequence $\mu_{n_k}$ weakly convergent to a distribution $\mu_0$. If $\mu_n$ does not converge weakly to $\mu_0$, then one can find another subsequence which is weakly convergent to a limit $\mu_0'$ different from $\mu_0$. However, it follows from the remark above that $\varphi(u)$ is the characteristic function of the distribution $\mu_0$ as well as of the distribution $\mu_0'$. On the other hand the characteristic function uniquely determines the distribution, thus $\mu_0 = \mu_0'$. The contradiction obtained shows that $\mu_n$ is weakly convergent to $\mu_0$. The theorem is thus proved. □

We note some additional often-used properties of characteristic functions.

If $\xi_1$ and $\xi_2$ are independent random vectors in $\mathscr{R}^m$, $\xi_3 = \xi_1 + \xi_2$, and $J_i(u)$ are the characteristic functions of $\xi_i$ ($i = 1, 2, 3$), then

$$J_3(u) = J_1(u)\,J_2(u). \tag{7}$$

Next, let $\xi_i$ ($i = 1, 2$) be random vectors with values in $\mathscr{R}^{m_1}$ and $\mathscr{R}^{m_2}$ respectively, and let $\xi_3 = (\xi_1, \xi_2)$ be the composite vector with values in $\mathscr{R}^{m_1} \times \mathscr{R}^{m_2}$. In order that $\xi_1$ and $\xi_2$ be independent it is necessary and sufficient that

$$J_3(u, v) = J_1(u)\,J_2(v), \tag{8}$$

where

$$J_3(u, v) = E\,e^{i(u,\xi_1) + i(v,\xi_2)}, \qquad J_1(u) = J_3(u, 0), \qquad J_2(v) = J_3(0, v).$$

The necessity of this condition is obvious.

The sufficiency follows from the uniqueness of a distribution function with a given characteristic function.
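
Relation (7) can be checked numerically. The following Python sketch is an editorial illustration, not part of the original text: it takes two independent discrete random variables, builds the distribution of their sum by convolution, and compares the characteristic function of the sum with the product of the two characteristic functions. The helper names `char_fn` and `sum_dist` and the chosen distributions (a fair die and a fair coin) are assumptions made only for this example.

```python
import cmath
from itertools import product


def char_fn(dist, u):
    """J(u) = E exp(i*u*xi) for a finite distribution {value: probability}."""
    return sum(p * cmath.exp(1j * u * x) for x, p in dist.items())


def sum_dist(d1, d2):
    """Distribution of xi1 + xi2 when xi1 and xi2 are independent."""
    out = {}
    for (x1, p1), (x2, p2) in product(d1.items(), d2.items()):
        out[x1 + x2] = out.get(x1 + x2, 0.0) + p1 * p2
    return out


die = {k: 1 / 6 for k in range(1, 7)}   # hypothetical example: fair die
coin = {0: 0.5, 1: 0.5}                 # hypothetical example: fair coin
total = sum_dist(die, coin)

# Relation (7): J_3(u) = J_1(u) * J_2(u) at a few sample points u.
for u in (0.0, 0.3, 1.7, -2.5):
    lhs = char_fn(total, u)
    rhs = char_fn(die, u) * char_fn(coin, u)
    assert abs(lhs - rhs) < 1e-12
```

The check is exact up to floating-point rounding because both sides are finite sums; for dependent variables the same comparison would in general fail.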

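Definition 12. The moment $m_{j_1 \ldots j_s}$ of an $s$-dimensional vector $\xi = (\xi^1, \ldots, \xi^s)$ is the quantity

$$m_{j_1 \ldots j_s} = E\,(\xi^1)^{j_1} (\xi^2)^{j_2} \cdots (\xi^s)^{j_s},$$

provided the expectation in the r.h.s. of the equality is finite. The value $q = j_1 + j_2 + \cdots + j_s$ is called the order of the moment.

It is easy to verify that if $E|\xi^k|^p < \infty$, $k = 1, 2, \ldots, s$, then all the moments of orders $q \le p$ are finite. Indeed, it follows from the relationship between the arithmetic and geometric means ($q = j_1 + j_2 + \cdots + j_s$) that

$$\prod_{k=1}^{s} |\xi^k|^{j_k} = \prod_{k=1}^{s} \bigl(|\xi^k|^q\bigr)^{j_k/q} \le \sum_{k=1}^{s} \frac{j_k}{q}\,|\xi^k|^q,$$

hence

$$E \prod_{k=1}^{s} |\xi^k|^{j_k} \le \sum_{k=1}^{s} \frac{j_k}{q}\,E|\xi^k|^q \le \sum_{k=1}^{s} \frac{j_k}{q}\,\bigl(E|\xi^k|^p\bigr)^{q/p} < \infty.$$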



Moments $m_{j_1 \ldots j_s}$ with integer-valued indices may be computed from the characteristic functions by means of differentiation. Indeed,

$$m_{j_1 \ldots j_s} = (-i)^q\,\frac{\partial^q J(u)}{\partial u_1^{j_1}\,\partial u_2^{j_2} \cdots \partial u_s^{j_s}}\bigg|_{u=0} \tag{9}$$

for $q \le p$ if $E|\xi^k|^p < \infty$, $k = 1, \ldots, s$. The proof of the formula follows from the fact that one may differentiate under the sign of mathematical expectation in the formula

$$J(u) = E\,e^{i(u,\xi)}.$$



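In certain cases it is necessary to use the converse assertion. The latter, however, holds only for moments with even indices. Let $\Delta_k$ be the operation of taking the symmetric finite difference in the variable $u_k$ and let $\Delta_k^j$ be its $j$-th power. Then

$$\Delta_k J(u_1, \ldots, u_s) = J(u_1, \ldots, u_k + h_k, \ldots, u_s) - J(u_1, \ldots, u_k - h_k, \ldots, u_s),$$

$$\Delta_k^j J(u_1, \ldots, u_s) = \sum_{r=0}^{j} (-1)^r C_j^r\,J(u_1, \ldots, u_k + (j - 2r)h_k, \ldots, u_s).$$

Hence, with $q = j_1 + \cdots + j_s$,

$$\prod_{k=1}^{s} \Delta_k^{2j_k} J(u)\Big|_{u=0} = E \prod_{k=1}^{s} \sum_{r=0}^{2j_k} (-1)^r C_{2j_k}^r\,e^{i(2j_k - 2r)h_k \xi^k} = E \prod_{k=1}^{s} \bigl(e^{ih_k\xi^k} - e^{-ih_k\xi^k}\bigr)^{2j_k} = (-1)^q\,E \prod_{k=1}^{s} \bigl(2 \sin h_k \xi^k\bigr)^{2j_k},$$

or

$$(-1)^q\,\frac{\prod_{k=1}^{s} \Delta_k^{2j_k} J(u)\big|_{u=0}}{\prod_{k=1}^{s} (2h_k)^{2j_k}} = E \prod_{k=1}^{s} \Bigl(\frac{\sin h_k \xi^k}{h_k}\Bigr)^{2j_k}.$$

Using Fatou's lemma we obtain

$$\underline{\lim_{h \to 0}}\ (-1)^q\,\frac{\prod_{k=1}^{s} \Delta_k^{2j_k} J(u)\big|_{u=0}}{\prod_{k=1}^{s} (2h_k)^{2j_k}} \ge E \prod_{k=1}^{s} (\xi^k)^{2j_k}.$$

The expression in the left-hand side of the obtained inequality is equal (up to the sign) to the derivative $\partial^{2q} J / \partial u_1^{2j_1} \cdots \partial u_s^{2j_s}$ evaluated at $u = 0$, provided $J$ is differentiable $2q$ times.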

We have thus obtained the following theorem.

Theorem 4. If the characteristic function $J(u_1, \ldots, u_s)$ is $p$ times differentiable (where $p$ is an even integer) at the point $u = 0$, then there exist moments of order $q \le p$ which may be evaluated using formula (9).
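
The role of symmetric finite differences can be illustrated in the one-dimensional case. The Python sketch below is an editorial example under stated assumptions: the distribution is a hypothetical three-point law, and the helper names `sym_diff_power` and `even_moment_estimate` are not from the text. It evaluates the $2q$-th power of the symmetric difference of $J$ at $u = 0$, which for step $h$ equals $(-1)^q (2h)^{2q}\,E(\sin h\xi / h)^{2q}$, and compares the rescaled value with the exact moment of order $2q$ for $q = 1$.

```python
import cmath
import math


def char_fn(dist, u):
    """J(u) = E exp(i*u*xi) for a finite distribution {value: probability}."""
    return sum(p * cmath.exp(1j * u * x) for x, p in dist.items())


def sym_diff_power(J, j, h, u=0.0):
    """j-th power of the symmetric difference: sum_r (-1)^r C(j, r) J(u + (j - 2r) h)."""
    return sum((-1) ** r * math.comb(j, r) * J(u + (j - 2 * r) * h)
               for r in range(j + 1))


def even_moment_estimate(J, q, h):
    """(-1)^q * Delta^{2q} J(0) / (2h)^{2q}, i.e. E (sin(h*xi)/h)^{2q}."""
    return ((-1) ** q * sym_diff_power(J, 2 * q, h) / (2 * h) ** (2 * q)).real


dist = {-1: 0.2, 0: 0.5, 2: 0.3}          # hypothetical three-point distribution
J = lambda u: char_fn(dist, u)

exact = sum(p * x ** 2 for x, p in dist.items())   # second moment E xi^2
estimate = even_moment_estimate(J, 1, 1e-3)
assert abs(estimate - exact) < 1e-4
```

Since $|\sin h\xi / h| \le |\xi|$, the estimate never exceeds the true moment and approaches it from below as $h \to 0$; very small steps eventually lose accuracy to floating-point cancellation.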

Random time. Suppose that stochastic experiments are carried out continually. Let a certain event A be considered. The realization of event A can be determined by observing the results of the experiments up to a certain random instant of time. Such an instant of time is called a random time. Occasionally it is referred to as a random variable independent of the future, as a Markov moment or as a stopping time.

The formal definition is as follows : 

Let T denote the set of real numbers corresponding to the times at 
which the stochastic experiments were carried out. 

Definition 13. A monotonically non-decreasing family of σ-algebras $\{\mathfrak{F}_t, t \in T\}$, $\mathfrak{F}_{t_1} \subset \mathfrak{F}_{t_2} \subset \mathfrak{S}$ if $t_1 < t_2$, on a given probability space $\{\Omega, \mathfrak{S}, P\}$ is called a current of σ-algebras (a current of experiments).

Here $\mathfrak{F}_t$ are interpreted as the class of all the observed events in the experiments carried out up to the moment $t$ inclusively.

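Definition 14. A random time on a current of σ-algebras $\{\mathfrak{F}_t, t \in T\}$ is a function $\tau = f(\omega)$ with values in $T$ defined on a subset $\Omega_\tau$ of the space $\Omega$ and such that $\{\tau \le t\} \in \mathfrak{F}_t$ for any $t \in T$.

The condition $\{\tau \le t\} \in \mathfrak{F}_t$ means the following: one may infer whether the random moment $\tau$ has occurred by the time $t$ by observing the results of the experiments at the instants $s$, $s \in T$, $s \le t$. The set $\Omega_\tau$ corresponds to the event that $\tau$ occurred during the time period $T$. Obviously, $\Omega_\tau$ is $\mathfrak{S}$-measurable. If $T$ possesses the maximal value $t_{\max}$, then $\Omega_\tau \in \mathfrak{F}_{t_{\max}}$. If no such maximal value exists and $t_k \uparrow \sup\{t, t \in T\}$, then $\Omega_\tau = \bigcup_k \{\tau \le t_k\}$.

Note that if $\Omega_\tau = \Omega$ the condition $\{\tau \le t\} \in \mathfrak{F}_t$ is equivalent to the requirement that $\{\tau > t\} \in \mathfrak{F}_t$, or in the case of a countable $T$ to the requirement that $\{\tau = t\} \in \mathfrak{F}_t$ for any $t \in T$.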

One can associate with the random time $\tau$ a minimal σ-algebra of events such that by observing the results of experiments up to the time $\tau$ inclusive one may infer about the realization of these events. Denote by $\mathfrak{F}_\tau$ the class of events $B$ ($B \subset \Omega_\tau$) such that $B \cap \{\tau \le t\} \in \mathfrak{F}_t$ for any $t \in T$. It is easy to verify that $\mathfrak{F}_\tau$ is a σ-algebra. We call $\mathfrak{F}_\tau$ the σ-algebra generated by the random time $\tau$.

Clearly the random variable $\tau$ is $\mathfrak{F}_\tau$-measurable.

As an example consider the case $\tau = t_0$, $t_0 \in T$. Then $\{\tau \le t\}$ is either $\emptyset$ or $\Omega$, so that $\tau = t_0$ is a particular case of a random time. Moreover, $B \cap \{\tau \le t\}$ is either $\emptyset$ or $B$, so that $\mathfrak{F}_\tau = \mathfrak{F}_{t_0}$, i.e. the notation introduced for the σ-algebra generated by a random time agrees in the particular case $\tau = t_0$ with the previous notations.

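We present a number of properties of the random time $\tau$ on a fixed current of σ-algebras $\{\mathfrak{F}_t; t \in T\}$.

a) If $K$ is a Borel set on the real line and $\sup\{x : x \in K \cap T\} \le t$, then the event $\{\tau \in K\}$ is $\mathfrak{F}_t$-measurable.

b) If $\theta(t)$ is a real Borel function which maps $T$ into $T$ and $\theta(t) \ge t$ ($t \in T$), then $\theta(\tau)$ is a random time.

This property follows from the previous one.

c) If $\tau_i$ are random times ($i = 1, 2$), then $\min(\tau_1, \tau_2)$ and $\max(\tau_1, \tau_2)$ are also random times. In particular, the quantity $\min(\tau, t_0)$, $t_0 \in T$, is a random time provided that $\tau$ is such.

The proof of these assertions follows from the fact that

$$\{\max(\tau_1, \tau_2) \le t\} = \{\tau_1 \le t\} \cap \{\tau_2 \le t\}$$

and

$$\{\min(\tau_1, \tau_2) \le t\} = \{\tau_1 \le t\} \cup \{\tau_2 \le t\}.$$

d) If $\tau_i$ ($i = 1, 2$) are random times and $\tau_1 \le \tau_2$, then $\mathfrak{F}_{\tau_1} \subset \mathfrak{F}_{\tau_2}$.

Indeed, let $A \in \mathfrak{F}_{\tau_1}$. Since $\{\tau_1 \le t\} \supset \{\tau_2 \le t\}$, then $A \cap \{\tau_2 \le t\} = A \cap \{\tau_1 \le t\} \cap \{\tau_2 \le t\} = B \cap \{\tau_2 \le t\} \in \mathfrak{F}_t$ in view of the fact that $B = A \cap \{\tau_1 \le t\} \in \mathfrak{F}_t$ and $\{\tau_2 \le t\} \in \mathfrak{F}_t$.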

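Let $T$ be a finite or a countable set, $\{\mathfrak{F}_t; t \in T\}$ be a current of σ-algebras, $\tau$ be a random time on $\{\mathfrak{F}_t; t \in T\}$, $\Omega_\tau = \Omega$. Consider the set of random variables $\{\xi(t), t \in T\}$ such that $\xi(t)$ is $\mathfrak{F}_t$-measurable for each $t \in T$. Set $\xi_\tau = \xi(t)$ if $\tau = t$. The quantity $\xi_\tau$ is thus defined for all $\omega \in \Omega$.

Lemma 5. The variable $\xi_\tau$ is $\mathfrak{F}_\tau$-measurable.

Indeed, let $c_k$, $k = 1, 2, \ldots$ be the set of possible values of $\tau$. Then

$$\{\omega : \xi_\tau < x\} \cap \{\omega : \tau \le t\} = \bigcup_{c_k \le t} \bigl(\{\omega : \xi_\tau < x\} \cap \{\omega : \tau = c_k\}\bigr) = \bigcup_{c_k \le t} \bigl(\{\omega : \xi(c_k) < x\} \cap \{\omega : \tau = c_k\}\bigr) \in \mathfrak{F}_t,$$

since each event in the last union belongs to $\mathfrak{F}_{c_k} \subset \mathfrak{F}_t$. □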
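
The definition of a random time can be tested mechanically on a finite model. The following Python sketch is an editorial illustration and rests on assumptions that are not in the text: the experiment is $N = 4$ coin tosses, $T = \{1, \ldots, N\}$, and $\mathfrak{F}_t$ is taken to be the class of events determined by the first $t$ tosses. The function names are hypothetical. The first toss showing a head is a random time, while the last toss showing a head is not, because deciding it requires knowledge of later tosses.

```python
from itertools import product

N = 4
OMEGA = list(product((0, 1), repeat=N))   # all outcomes of N coin tosses


def in_F_t(event, t):
    """True if `event` (a set of outcomes) is determined by the first t tosses."""
    for w in OMEGA:
        for v in OMEGA:
            if w[:t] == v[:t] and ((w in event) != (v in event)):
                return False
    return True


def is_random_time(tau):
    """Check that {tau <= t} belongs to F_t for every t in T = {1, ..., N}."""
    return all(in_F_t({w for w in OMEGA if tau(w) <= t}, t)
               for t in range(1, N + 1))


def first_head(w):
    """Index of the first toss showing 1 (N if there is none)."""
    return next((k + 1 for k, x in enumerate(w) if x == 1), N)


def last_head(w):
    """Index of the last toss showing 1 (N if there is none)."""
    return max((k + 1 for k, x in enumerate(w) if x == 1), default=N)


assert is_random_time(first_head)
assert not is_random_time(last_head)
# Property b): theta(tau) with theta(t) = min(t + 1, N) >= t is again a random time.
assert is_random_time(lambda w: min(first_head(w) + 1, N))
# Property c): max(tau, t0) with the constant time t0 = 2 is a random time.
assert is_random_time(lambda w: max(first_head(w), 2))
```

The brute-force membership test compares all pairs of outcomes, which is adequate for $2^4$ outcomes but is meant only as a check of the definition, not as an efficient procedure.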



§2. Independence 

Definitions. Let $\{\Omega, \mathfrak{S}, P\}$ be a fixed probability space. Unless otherwise specified, events in this section are understood to be $\mathfrak{S}$-measurable subsets of $\Omega$.

Two events $A$ and $B$ are called independent if $P(A \cap B) = P(A)\,P(B)$. The statements below follow directly from the definition:

a) $\Omega$ and $A$, where $A$ is an arbitrary event, are independent.

b) If $P(N) = 0$ and $A$ is arbitrary, then $N$ and $A$ are independent.

c) If $A$ and $B_i$ ($i = 1, 2$) are independent, $B_1 \supset B_2$, then $A$ and $B_1 \setminus B_2$ are independent. In particular, $A$ and the complement $\bar{B}_i$ are independent.

d) If $A$ and $B_i$ are independent ($i = 1, 2, \ldots, n$), and $B_1, B_2, \ldots, B_n$ are pairwise disjoint, then $A$ and $\bigcup_{i=1}^{n} B_i$ are also independent.

Note that without the stipulation that the $B_i$ are pairwise disjoint, the last assertion does not generally hold.

e) $A$ is independent of $A$ if and only if $P(A) = 0$ or $P(A) = 1$.

Let $I$ be a set, and let $\{\mathfrak{M}_i, i \in I\}$ be the set of classes of events enumerated by means of the index $i$, taking on values from $I$.

Definition 1. The classes of events $\{\mathfrak{M}_i, i \in I\}$ are called independent (or jointly independent) if for arbitrary pairwise different indices $i_1, i_2, \ldots, i_n$ ($i_k \in I$) and arbitrary $A_{i_k} \in \mathfrak{M}_{i_k}$, $k = 1, 2, \ldots, n$,

$$P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_n}) = P(A_{i_1})\,P(A_{i_2}) \cdots P(A_{i_n}).$$

Note that for an infinite collection of classes of events the definition of independence is equivalent to the requirement that an arbitrary finite subcollection of classes of events will consist of independent classes of events.
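
Joint independence in the sense of Definition 1 is strictly stronger than pairwise independence, and the same classical example shows why disjointness is needed in property d). The Python sketch below is an editorial illustration; the two-coin sample space and the names `A`, `B`, `C`, `independent` are assumptions chosen for the example, with exact arithmetic via `fractions.Fraction`.

```python
from fractions import Fraction
from itertools import combinations, product

OMEGA = list(product("HT", repeat=2))          # two tosses of a fair coin
P = {w: Fraction(1, 4) for w in OMEGA}


def prob(event):
    return sum(P[w] for w in event)


def independent(*events):
    """Product rule P(A_1 ∩ ... ∩ A_n) = P(A_1) ... P(A_n) for the given events."""
    inter = set(OMEGA).intersection(*events)
    rhs = Fraction(1)
    for e in events:
        rhs *= prob(e)
    return prob(inter) == rhs


A = {w for w in OMEGA if w[0] == "H"}          # first toss is a head
B = {w for w in OMEGA if w[1] == "H"}          # second toss is a head
C = {w for w in OMEGA if w[0] == w[1]}         # both tosses agree

pairwise = all(independent(x, y) for x, y in combinations((A, B, C), 2))
joint = pairwise and independent(A, B, C)

assert pairwise and not joint
# A is independent of B and of C, but B and C are not disjoint,
# and A is not independent of their union.
assert not independent(A, B | C)
```

Here $P(A \cap B \cap C) = 1/4$ while $P(A)P(B)P(C) = 1/8$, and $P(A \cap (B \cup C)) = 1/4$ while $P(A)P(B \cup C) = 3/8$.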

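Henceforth, the notation $\sigma\{\mathfrak{M}\}$ will denote the minimal σ-algebra containing $\mathfrak{M}$.

A class of events $\mathfrak{N}$ is called a π-class if it is closed with respect to the operation of intersection of events ($A_k \in \mathfrak{N}$, $k = 1, 2$, implies $A_1 \cap A_2 \in \mathfrak{N}$), and the class of events is called a λ-class if

a) $A_k \in \mathfrak{N}$, $k = 1, 2, \ldots$, and $A_k \cap A_r = \emptyset$ for $k \ne r$ implies that $\bigcup_{k=1}^{\infty} A_k \in \mathfrak{N}$;

b) $\Omega \in \mathfrak{N}$, and $A_k \in \mathfrak{N}$, $k = 1, 2$, $A_1 \subset A_2$ implies $A_2 \setminus A_1 \in \mathfrak{N}$.

Evidently if $\mathfrak{N}$ is simultaneously a π-class and a λ-class it is also a σ-algebra.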

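Lemma 1. If the λ-class $\mathfrak{N}$ contains the π-class $\mathfrak{M}$, then $\sigma\{\mathfrak{M}\}$ is contained in $\mathfrak{N}$.

Proof. Denote by $\mathfrak{N}_1$ the minimal λ-class containing $\mathfrak{M}$ ($\mathfrak{N}_1$ is the intersection of all the λ-classes containing $\mathfrak{M}$). We show that $\mathfrak{N}_1 = \sigma\{\mathfrak{M}\}$. Let $\mathfrak{N}(B)$ denote the class of all events $A$ in $\mathfrak{N}_1$ for which $A \cap B \in \mathfrak{N}_1$.

It is easy to verify that $\mathfrak{N}(B)$ is a λ-class. If $B \in \mathfrak{M}$, then $\mathfrak{N}(B) \supset \mathfrak{M}$ (since $\mathfrak{M}$ is a π-class). Therefore $\mathfrak{N}(B) = \mathfrak{N}_1$ ($B \in \mathfrak{M}$). But this means that $A \cap B \in \mathfrak{N}_1$ for any $A \in \mathfrak{N}_1$ and $B \in \mathfrak{M}$, so that $\mathfrak{N}(A) \supset \mathfrak{M}$ and therefore $\mathfrak{N}(A) = \mathfrak{N}_1$ for any $A \in \mathfrak{N}_1$. Thus, $\mathfrak{N}_1$ is a π-class. In view of the previous remark $\mathfrak{N}_1$ is a σ-algebra, and hence $\mathfrak{N} \supset \mathfrak{N}_1 = \sigma\{\mathfrak{M}\}$. □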

Theorem 1. Let $\{\mathfrak{M}_i, i \in I\}$ be a collection of independent π-classes. Then the minimal σ-algebras $\sigma\{\mathfrak{M}_i\}$, $i \in I$, are independent.

Proof. Without loss of generality, a finite number of classes $\mathfrak{M}_1, \ldots, \mathfrak{M}_n$ will be considered. It is sufficient to show that if one of the classes, e.g. $\mathfrak{M}_1$, is replaced by $\sigma\{\mathfrak{M}_1\}$, then the new sequence of classes will also be independent.

Denote by $\mathfrak{N}$ the class of all those events which do not depend on $\mathfrak{M}_2, \ldots, \mathfrak{M}_n$. By definition, $\mathfrak{M}_1 \subset \mathfrak{N}$, and $\mathfrak{N}$ possesses the following properties: it is closed with respect to summation of non-intersecting events and taking the difference $B_1 \setminus B_2$ under the condition that $B_2 \subset B_1$. Thus $\mathfrak{N}$ is a λ-class and in view of the preceding lemma $\mathfrak{N} \supset \sigma\{\mathfrak{M}_1\}$. The theorem is thus proved. □

Theorem 2. Let $\{\mathfrak{M}_i, i \in I\}$ be a collection of independent classes of events each of which is closed with respect to the intersection operation and let $I = I_1 \cup I_2$, $I_1 \cap I_2 = \emptyset$. Denote by $\mathfrak{B}_j$ ($j = 1, 2$) the minimal σ-algebra containing all the $\mathfrak{M}_i$, $i \in I_j$. Then $\mathfrak{B}_1$ and $\mathfrak{B}_2$ are independent.

In view of the previous theorem, it is sufficient to consider the case when the $\mathfrak{M}_i$ are σ-algebras. Consider classes $\mathfrak{N}_j$ ($j = 1, 2$) consisting of all possible events of the form $A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_n}$, $A_{i_k} \in \mathfrak{M}_{i_k}$, where $n$ is arbitrary and $i_k \in I_j$. These are closed with respect to intersections, $\mathfrak{N}_j$ contains all $\mathfrak{M}_i$, $i \in I_j$, and $\mathfrak{N}_1$ and $\mathfrak{N}_2$ are independent. In view of the previous theorem $\sigma\{\mathfrak{N}_1\} = \sigma\{\mathfrak{M}_i, i \in I_1\}$ and $\sigma\{\mathfrak{N}_2\} = \sigma\{\mathfrak{M}_i, i \in I_2\}$ are independent.

Corollary. If $I$ is subdivided into an arbitrary collection of pairwise disjoint subsets, $I = \bigcup_{j \in M} I_j$, then the σ-algebras $\{\mathfrak{B}_j = \sigma(\mathfrak{M}_i, i \in I_j), j \in M\}$ are jointly independent.

Independent random variables. Definition 2. Random variables $\{\zeta_i, i \in I\}$ are called independent (jointly independent) if the classes of events $\mathfrak{M}_i$, $i \in I$, where $\mathfrak{M}_i$ consists of the events of the type $\{\omega : \zeta_i < a\}$, $-\infty < a < \infty$, are independent.

The definition of independence of a collection of random variables is equivalent to the following: random variables $\zeta_i$ ($i \in I$) are independent if for any $n$ and any $i_k \in I$, $k = 1, \ldots, n$, the joint distribution function of the variables $\zeta_{i_1}, \ldots, \zeta_{i_n}$ is equal to the product of the distribution functions of the variables $\zeta_{i_k}$:

$$F_{i_1 \ldots i_n}(a_1, \ldots, a_n) = \prod_{k=1}^{n} F_{i_k}(a_k).$$

The independence of a collection of classes of random variables is stated analogously.

Consider a collection of classes of random variables $\{\zeta_i^\mu, i \in I_\mu\}$, where $\mu$ is a fixed index and $i$ runs over the set $I_\mu$ depending on the index $\mu$. For convenience, we refer to this set as a class and consider a collection of such classes enumerated by means of the index $\mu$ running over the set $M$.

Definition 3. Classes of random variables $\{\zeta_i^\mu, i \in I_\mu\}$ ($\mu \in M$) are called (mutually) independent if the sets of events $\mathfrak{M}_\mu$ ($\mu \in M$) are mutually independent. Here $\mathfrak{M}_\mu$ consists of all the events of the form

$$\bigcap_{k=1}^{n} \{\omega : \zeta_{i_k}^\mu < a_k\}, \tag{1}$$

$n = 1, 2, \ldots$, $i_k \in I_\mu$, $-\infty < a_k < \infty$.

Definition 4. A σ-algebra of events $\sigma\{\zeta_i, i \in I\}$, generated by the events of the form

$$\bigcap_{k=1}^{n} \{\omega : \zeta_{i_k} < a_k\},$$

$n = 1, 2, \ldots$, $i_k \in I$, $-\infty < a_k < \infty$, is called a σ-algebra generated by the class of random variables $\{\zeta_i, i \in I\}$. The closure of the σ-algebra $\sigma\{\zeta_i, i \in I\}$ is denoted by $\tilde{\sigma}\{\zeta_i, i \in I\}$.

In other words, $\sigma\{\zeta_i, i \in I\}$ is the minimal σ-algebra of events with respect to which all the $\zeta_i$ are random variables (i.e. it is the minimal σ-algebra of sets with respect to which all the functions $\zeta_i = f_i(\omega)$ are measurable).

We note in particular that the σ-algebra of events generated by a single random variable $\zeta$ is the minimal σ-algebra containing events of the form $\{\omega : \zeta < a\}$ ($-\infty < a < \infty$).

Theorem 3. If classes of random variables $\{\zeta_i^\mu, i \in I_\mu\}$, $\mu \in M$, are independent, then the collections of σ-algebras $\sigma\{\zeta_i^\mu, i \in I_\mu\}$, $\mu \in M$, and their closures $\tilde{\sigma}\{\zeta_i^\mu, i \in I_\mu\}$ are also independent.

The proof follows from the fact that the classes $\mathfrak{M}_\mu$ introduced in Definition 3 are closed with respect to intersections of events inside the class and from Theorem 1. □

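Corollary. Let $g_\mu(t_1, t_2, \ldots, t_s)$, $\mu \in M$, be a set of finite Borel functions of $s$ real variables. If the sequences of random variables $\{(\zeta_1^\mu, \zeta_2^\mu, \ldots, \zeta_s^\mu), \mu \in M\}$ are jointly independent, then the random variables $\eta_\mu = g_\mu(\zeta_1^\mu, \zeta_2^\mu, \ldots, \zeta_s^\mu)$, $\mu \in M$, are also independent.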

The notion of independence of random variables and the theorem proved above are easily generalized to the case of random elements in an arbitrary measurable space $\{\mathscr{X}, \mathfrak{B}\}$.

Let $\{\zeta_i, i \in I\}$ be a set of random elements in $\{\mathscr{X}, \mathfrak{B}\}$. The elements $\{\zeta_i, i \in I\}$ are called independent (or jointly independent) if for an arbitrary $n$, $n = 1, 2, \ldots$, arbitrary $i_k \in I$ and arbitrary $B_k \in \mathfrak{B}$,

$$P\Bigl\{\bigcap_{k=1}^{n} \{\zeta_{i_k} \in B_k\}\Bigr\} = \prod_{k=1}^{n} P\{\zeta_{i_k} \in B_k\}. \tag{2}$$

A collection of independent classes of random elements is defined analogously.

An arbitrary collection of random elements generates the minimal σ-algebra of sets in $\Omega$ with respect to which every random element is measurable. From the independence of a certain collection of classes of random elements follows the independence of the minimal σ-algebras (and their closures) generated by the corresponding classes of random elements. The proof of this assertion is analogous to the one for random variables.

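Let a sequence of random elements $\xi_k = f_k(\omega)$, $k = 1, 2, \ldots, n$, with values in measurable spaces $\{\mathscr{X}_k, \mathfrak{B}_k\}$ be given. This sequence may be considered as a random element with values in $\prod_{k=1}^{n} \mathscr{X}_k$. Indeed, denote by $\mathfrak{B}^{(n)}$ the product of the σ-algebras $\mathfrak{B}_1, \ldots, \mathfrak{B}_n$. If $C = A_1 \times A_2 \times \cdots \times A_n$, $A_k \in \mathfrak{B}_k$, $k = 1, 2, \ldots, n$, then

$$\{\omega : (f_1(\omega), \ldots, f_n(\omega)) \in C\} = \bigcap_{i=1}^{n} \{\omega : f_i(\omega) \in A_i\},$$

i.e. the preimage of $C$ is $\mathfrak{S}$-measurable. Hence, the preimage of any set belonging to the minimal σ-algebra containing all $C$, i.e. the preimage of any set in $\mathfrak{B}^{(n)}$, is $\mathfrak{S}$-measurable. Denote by $m_{1,2,\ldots,n}$ the measure in $\bigl\{\prod_{k=1}^{n} \mathscr{X}_k, \mathfrak{B}^{(n)}\bigr\}$ induced by the sequence $(\xi_1, \ldots, \xi_n)$, i.e.

$$m_{1,2,\ldots,n}(C) = P\{(\xi_1, \ldots, \xi_n) \in C\}, \quad C \in \mathfrak{B}^{(n)}.$$

Assume that the elements $\xi_k$, $k = 1, \ldots, n$, are independent. Then formula (2) shows that

$$m_{1,2,\ldots,n}(A_1 \times A_2 \times \cdots \times A_n) = m_1(A_1)\,m_2(A_2) \cdots m_n(A_n),$$

where $m_k(A_k) = P\{\xi_k \in A_k\}$. In view of the uniqueness of continuation of a measure from a semiring of sets onto a minimal σ-algebra, the measure $m_{1,2,\ldots,n}$ is the product of the measures $m_1, m_2, \ldots, m_n$. The converse is trivial: if the measure $m_{1,2,\ldots,n}$ coincides with the product of the measures $m_1, m_2, \ldots, m_n$, then the random elements $\xi_1, \xi_2, \ldots, \xi_n$ are independent. We have thus obtained

Theorem 4. The random elements $\xi_1, \xi_2, \ldots, \xi_n$ are independent if and only if the measure $m_{1,2,\ldots,n}$ induced by the sequence $\xi_1, \xi_2, \ldots, \xi_n$ on the σ-algebra $\mathfrak{B}^{(n)}$ is the product of the measures $m_k$ ($k = 1, \ldots, n$) induced by the elements $\xi_k$ on $\mathfrak{B}_k$.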

Theorem 5. Let $g(x_1, x_2)$ be a $\mathfrak{B}^{(2)}$-measurable finite function, let $\xi_1$ and $\xi_2$ be independent random elements and let, moreover, $E|g(\xi_1, \xi_2)| < \infty$. Then $\varphi(x_1) = E\,g(x_1, \xi_2)$ is a $\mathfrak{B}_1$-measurable function of $x_1$ and

$$E\,g(\xi_1, \xi_2) = E\,\varphi(\xi_1),$$

or

$$E\,g(\xi_1, \xi_2) = \int_{\mathscr{X}_1} m_1(dx_1) \int_{\mathscr{X}_2} g(x_1, x_2)\,m_2(dx_2).$$

The same formula remains valid for an arbitrary $\tilde{\mathfrak{B}}^{(2)}$-measurable function, where the sign $\sim$ indicates the completion of a σ-algebra (or measure), if the measures $m_1$ and $m_2$ are assumed to be complete. The theorem is a direct consequence of the theorem on change of variables for abstract integrals, which yields that

$$E\,g(\xi_1, \xi_2) = \int_{\mathscr{X}_1 \times \mathscr{X}_2} g(x_1, x_2)\,m_{1,2}(d(x_1, x_2)),$$

and of Fubini's theorem. □

Corollary. If $\xi_1$ and $\xi_2$ are independent random variables with finite expectations, then

$$E\,\xi_1 \xi_2 = E\,\xi_1\,E\,\xi_2.$$

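Zero-one law. Let $A_n$, $n = 1, 2, \ldots$, be a sequence of events.

Theorem 6. If $\sum_{n=1}^{\infty} P(A_n) < \infty$, then the event $\overline{\lim}\,A_n$ is of probability 0.

The proof follows from the formula $\overline{\lim}\,A_n = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k$, which yields that

$$P(\overline{\lim}\,A_n) = \lim_{n \to \infty} P\Bigl(\bigcup_{k=n}^{\infty} A_k\Bigr) \le \lim_{n \to \infty} \sum_{k=n}^{\infty} P(A_k) = 0.$$

□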



The following stronger result is valid in the case of a sequence of independent events.

Theorem 7. (Borel-Cantelli lemma). If the events $\{A_n, n = 1, 2, \ldots\}$ are independent, then the probability of the event $\overline{\lim}\,A_n$ is either 0 or 1 depending on whether the series $\sum_{n=1}^{\infty} P(A_n)$ is convergent or divergent.

One need prove only that if $\sum_{n=1}^{\infty} P(A_n) = \infty$, then $P(\overline{\lim}\,A_n) = 1$. If $A^* = \overline{\lim}\,A_n$, then

$$\Omega \setminus A^* = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} (\Omega \setminus A_k)$$

and

$$P\Bigl(\bigcap_{k=n}^{\infty} (\Omega \setminus A_k)\Bigr) = \lim_{N \to \infty} \prod_{k=n}^{N} P(\Omega \setminus A_k) = \lim_{N \to \infty} \prod_{k=n}^{N} \bigl(1 - P(A_k)\bigr) = 0$$

in view of the divergence of the series $\sum_{k=1}^{\infty} P(A_k)$. □
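
The product appearing in this proof can be evaluated exactly for simple choices of $P(A_k)$. The following Python sketch is an editorial illustration; the two probability sequences $1/k$ and $1/k^2$ and the function name `prob_none_occur` are assumptions made for the example. For independent events the probability that none of $A_n, \ldots, A_N$ occurs is $\prod_{k=n}^{N}(1 - P(A_k))$: with $P(A_k) = 1/k$ (divergent series) it telescopes to $(n-1)/N$ and tends to zero, while with $P(A_k) = 1/k^2$ (convergent series) it stays bounded away from zero.

```python
from fractions import Fraction


def prob_none_occur(p, n, N):
    """prod_{k=n}^{N} (1 - p(k)): probability that no A_k, n <= k <= N, occurs
    when the A_k are independent with P(A_k) = p(k)."""
    out = Fraction(1)
    for k in range(n, N + 1):
        out *= 1 - p(k)
    return out


divergent = lambda k: Fraction(1, k)        # sum of P(A_k) diverges
convergent = lambda k: Fraction(1, k * k)   # sum of P(A_k) converges

# Telescoping products: (n - 1) / N and (N + 1) / (2 N) for n = 2.
assert prob_none_occur(divergent, 2, 1000) == Fraction(1, 1000)
assert prob_none_occur(convergent, 2, 1000) == Fraction(1001, 2000)
```

In the divergent case the limit over $N$ is zero for every starting index $n$, which is exactly what forces $P(\overline{\lim}\,A_n) = 1$.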

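Consider now an arbitrary sequence of independent σ-algebras $\mathfrak{B}_n$, $n = 1, 2, \ldots$. In view of the Borel-Cantelli lemma the event $A = \overline{\lim}\,A_n$, where $A_n$ is an arbitrary sequence such that $A_n \in \mathfrak{B}_n$, is of probability 0 or 1. This result can be generalized to arbitrary events generated by the collection of all σ-algebras $\mathfrak{B}_n$, $n = 1, 2, \ldots$, and which are independent of an arbitrary finite sequence of σ-algebras $\mathfrak{B}_1, \mathfrak{B}_2, \ldots, \mathfrak{B}_n$. More precisely, let $\mathfrak{B}^k = \sigma\{\mathfrak{B}_k, \mathfrak{B}_{k+1}, \ldots\}$ be the σ-algebra generated by the sequence $\mathfrak{B}_n$, $n = k, k+1, \ldots$; the $\mathfrak{B}^k$ form a monotonically decreasing sequence of σ-algebras. Their intersection $\bigcap_{k=1}^{\infty} \mathfrak{B}^k$ is also a σ-algebra. Define

$$\overline{\lim}\,\mathfrak{B}_n = \bigcap_{k=1}^{\infty} \mathfrak{B}^k.$$

Clearly the σ-algebra $\overline{\lim}\,\mathfrak{B}_n$ remains unchanged if we replace an arbitrary finite number of the σ-algebras $\mathfrak{B}_n$ by some other σ-algebras.

Theorem 8. (Kolmogorov's general zero-one law) If $\mathfrak{B}_n$, $n = 1, 2, \ldots$, are mutually independent σ-algebras, then any event in $\overline{\lim}\,\mathfrak{B}_n$ has probability either 0 or 1.

Indeed, let $A \in \overline{\lim}\,\mathfrak{B}_n$. Then $A \in \mathfrak{B}^k$ for any $k$ and thus $A$ and $\sigma\{\mathfrak{B}_1, \ldots, \mathfrak{B}_{k-1}\}$ are independent. The events belonging to one of the σ-algebras $\sigma\{\mathfrak{B}_1, \ldots, \mathfrak{B}_{k-1}\}$, $k = 2, 3, \ldots$, form a π-class generating $\sigma\{\mathfrak{B}_1, \ldots, \mathfrak{B}_n, \ldots\}$, so by Theorem 1 the event $A$ does not depend on this σ-algebra. Since $A \in \sigma\{\mathfrak{B}_1, \ldots, \mathfrak{B}_n, \ldots\}$, $A$ does not depend on $A$. But this is possible only if either $P(A) = 0$ or $P(A) = 1$. □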

Corollary. Let $\{\xi_n, n = 1, 2, \ldots\}$ be a sequence of independent random elements in a fixed metric space $\{\mathscr{X}, \mathfrak{B}\}$, and let $\mathfrak{B}_n = \sigma\{\xi_n\}$ be the σ-algebra generated by $\xi_n$. Then

a) the limit of the sequence $\{\xi_n, n = 1, 2, \ldots\}$ exists with either probability 0 or with probability 1;

b) if $\mathscr{X}$ is separable and complete, then the limit of the sequence $\{\xi_n, n = 1, 2, \ldots\}$ is constant (mod $P$) with probability 1 provided it exists;

c) if $z = f(x_1, x_2, \ldots, x_n, \ldots)$ is a function of infinitely many arguments $x_n \in \mathscr{X}$, $n = 1, 2, \ldots$, and for every $n$ the variable $\zeta = f(\xi_1, \xi_2, \ldots, \xi_n, \ldots)$ is $\mathfrak{B}^n$-measurable, then $\zeta$ is constant with probability 1.



Proof. a) If $\rho(x, y)$ is the distance in $\mathscr{X}$, then the set $D$ of points on which $\xi_n$ is convergent can be written in the form

$$D = \bigcap_{k=1}^{\infty} \bigcup_{n=1}^{\infty} \bigcap_{n', n'' \ge n} \Bigl\{\rho(\xi_{n'}, \xi_{n''}) < \frac{1}{k}\Bigr\}.$$

Since the events $A_n^k = \bigcap_{n', n'' \ge n} \bigl\{\rho(\xi_{n'}, \xi_{n''}) < \frac{1}{k}\bigr\}$ are monotonically increasing in $n$, $\bigcup_{n=1}^{\infty} A_n^k = \bigcup_{n=r}^{\infty} A_n^k$ for every $r$, so that $D \in \mathfrak{B}^r$ for every $r$ and the general zero-one law is applicable. □

b) Let $F$ be a closed set, $F \subset \mathscr{X}$, and denote by $F_k$ the open set $F_k = \bigl\{x : \rho(x, F) < \frac{1}{k}\bigr\}$. Then the event $A = D \cap \{\lim \xi_n \in F\}$ can be represented as

$$A = D \cap \Bigl[\bigcap_{k=1}^{\infty} \bigcup_{n=1}^{\infty} \bigcap_{n' \ge n} \{\xi_{n'} \in F_k\}\Bigr].$$

Hence, in view of the same considerations as those used in the proof of a), $A \in \mathfrak{B}^r$, $r = 1, 2, \ldots$. Therefore $P\{\lim \xi_n \in F\} = 0$ or 1 for any closed $F$. But the class of sets for which an analogous assertion is valid is a σ-algebra, therefore $P\{\lim \xi_n \in B\} = 0$ or 1 for any $B \in \mathfrak{B}$. From here it easily follows that in the case of a separable and complete space $\mathscr{X}$ the measure $m$ induced on $\mathfrak{B}$ by the random element $\lim \xi_n$ is concentrated on a single atom. Indeed, since $m(\mathscr{X}) = 1$, a sphere $S_1$ of radius 1 can be found such that $m(S_1) = 1$. If there were no such sphere then all the spheres in $\mathscr{X}$ of radius 1 would have been of measure 0, which is impossible since $\mathscr{X}$ is covered by a countable number of such spheres. Analogously, a sphere $S_2$ of radius $\frac{1}{2}$ can be found such that $S_2 \subset S_1$ and $m(S_2) = 1$. Continuing this argument we obtain a sequence of nested spheres of measure 1 with radii approaching zero. These spheres have only one point $x$ in common and $m\{x\} = \lim m(S_n) = 1$. □

c) By assumption, the events $A = \{\omega : f(\xi_1, \ldots, \xi_n, \ldots) < a\} \in \mathfrak{B}^n$ for every $n$; therefore $A \in \overline{\lim}\,\mathfrak{B}_n$, so that $A$ is of probability either 0 or 1. Therefore, the distribution function of the random variable $\zeta = f(\xi_1, \ldots, \xi_n, \ldots)$ takes on the values 0 or 1 only and the variable $\zeta$ is a constant with probability one. □







§ 3. Conditional Probabilities and Conditional Expectations 



Definitions. We first recall the definition of the conditional probability and conditional expectation in the elementary case. The conditional probability $P(A \mid B)$ of an event $A$ given $B$ such that $P(B) \ne 0$ is defined by the relationship

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$$

For a fixed $B$ the conditional probability $P(A \mid B)$ is a normed measure defined on the same σ-algebra of events as the “unconditional” probability $P(A)$. Correspondingly the conditional mathematical expectation of a certain random variable $\xi = f(\omega)$, given $B$, is defined by the formula

$$E\{\xi \mid B\} = \int_{\Omega} f(\omega)\,P(d\omega \mid B).$$

Taking into account the definition of conditional probability this relationship can be rewritten as follows:

$$P(B)\,E\{\xi \mid B\} = \int_B \xi\,dP. \tag{1}$$



To be able to define conditional mathematical expectations and conditional probabilities given events of probability zero it is necessary to reconsider these notions. Firstly, we note that if $\xi$ is the indicator of the event $A$ then $E\{\xi \mid B\} = P(A \mid B)$. Therefore conditional probabilities are particular cases of conditional expectations and we shall consider meanwhile only the latter. Let $\mathfrak{M}$ be a countable class of disjoint events $\{B_i; i = 1, 2, \ldots, B_i \in \mathfrak{S}\}$ and $\bigcup_{i=1}^{\infty} B_i = \Omega$. Define a random variable $E\{\xi \mid \mathfrak{M}\}$ to be equal to $E\{\xi \mid B_i\}$ if $\omega \in B_i$ and call it the conditional mathematical expectation of $\xi$ given the class of sets $\mathfrak{M}$. This variable is defined only for the values of $\omega$ which belong to $B_i$ such that $P(B_i) \ne 0$, i.e. the random variable $E\{\xi \mid \mathfrak{M}\}$ is defined with probability 1. This variable is constant on each set $B_i \in \mathfrak{M}$ such that $P(B_i) \ne 0$ and is equal there to the conditional mathematical expectation of $\xi$ given $B_i$. Observe that knowing $E\{\xi \mid \mathfrak{M}\}$ one can define not only $E\{\xi \mid B_i\}$ for $B_i \in \mathfrak{M}$, $P\{B_i\} \ne 0$, but also the conditional mathematical expectation of the random variable given an arbitrary $B$, with $P\{B\} \ne 0$, belonging to $\sigma\{\mathfrak{M}\}$. Indeed, if $B = \bigcup_{k=1}^{\infty} B_{j_k}$, then

$$P(B)\,E\{\xi \mid B\} = \sum_{k=1}^{\infty} P(B_{j_k})\,E\{\xi \mid B_{j_k}\}. \tag{2}$$

This formula shows how one can compute conditional mathematical expectations for given countable sums of sets by knowing the conditional mathematical expectations for the given $B_i$'s ($i = 1, 2, \ldots$), and hence any conditional probability given an arbitrary set from the smallest σ-algebra containing all $B_i$. Note that relation (2) can be written as follows:

$$\int_B \xi\,dP = \int_B E\{\xi \mid \mathfrak{M}\}\,P(d\omega).$$

This relation holds for an arbitrary $B$ in the σ-algebra generated by $\mathfrak{M}$, and the random variable $E\{\xi \mid \mathfrak{M}\}$ is measurable with respect to this σ-algebra.
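
On a finite probability space the construction above reduces to block averages. The Python sketch below is an editorial illustration: the space (one roll of a fair die), the variable $\xi(\omega) = \omega^2$, the partition and the helper names are all assumptions chosen for the example. It computes $E\{\xi \mid \mathfrak{M}\}$ for a partition $\mathfrak{M}$ and verifies the integral relation $\int_B \xi\,dP = \int_B E\{\xi \mid \mathfrak{M}\}\,dP$ for every $B$ that is a union of partition blocks.

```python
from fractions import Fraction
from itertools import combinations

OMEGA = range(1, 7)                              # one roll of a fair die
P = {w: Fraction(1, 6) for w in OMEGA}
xi = {w: Fraction(w * w) for w in OMEGA}         # xi(omega) = omega^2
PARTITION = [{1, 2}, {3, 4, 5}, {6}]             # the class M of disjoint events


def integral(f, B):
    """Integral of f over the event B with respect to P."""
    return sum(f[w] * P[w] for w in B)


def cond_exp(f, partition):
    """E{f | M}: on each block B_i the constant value E{f | B_i}."""
    out = {}
    for block in partition:
        value = integral(f, block) / sum(P[w] for w in block)
        for w in block:
            out[w] = value
    return out


ce = cond_exp(xi, PARTITION)

# Every B in sigma{M} is a union of blocks; check the defining relation on all of them.
for r in range(1, len(PARTITION) + 1):
    for blocks in combinations(PARTITION, r):
        B = set().union(*blocks)
        assert integral(xi, B) == integral(ce, B)
```

The conditional expectation takes the values $5/2$, $50/3$ and $36$ on the three blocks, and exact rational arithmetic makes the equality of the two integrals hold without rounding error.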

It is easy to verify that these properties define the conditional mathematical expectation uniquely (mod $P$). Indeed, if there exist two $\mathfrak{F}$-measurable random variables $\eta_i$ ($i = 1, 2$) for which

$$\int_B \eta_1\,dP = \int_B \eta_2\,dP$$

for any $B \in \mathfrak{F}$ ($\mathfrak{F}$ is a σ-algebra), then $\eta_1$ and $\eta_2$ coincide $P$-almost everywhere.

The last observation can be used for the definition of conditional mathematical expectation in the general case. Let a certain experiment described by a σ-algebra of events $\mathfrak{B}$ be carried out. It is required to determine the conditional mathematical expectation of a random variable under the assumption that the result of the experiment is known. This conditional mathematical expectation is considered to be a function of the result of the experiment, i.e. a $\mathfrak{B}$-measurable random variable satisfying the relationship just obtained.

Definition 1. Let $\mathfrak{B}$ be an arbitrary σ-algebra of events contained in $\mathfrak{S}$ and $\xi$ an arbitrary random variable for which the mathematical expectation exists. The conditional mathematical expectation of the random variable $\xi$ given the σ-algebra $\mathfrak{B}$ is the random variable $E\{\xi \mid \mathfrak{B}\}$ measurable with respect to $\mathfrak{B}$ and satisfying the equality

$$\int_B E\{\xi \mid \mathfrak{B}\}\,dP = \int_B \xi\,dP \tag{3}$$

for any $B \in \mathfrak{B}$.

The existence and uniqueness (up to an equivalence) of the random variable $E\{\xi \mid \mathfrak{B}\}$ follows directly from the Radon-Nikodym theorem. Indeed, the r.h.s. of (3) is a σ-finite countably-additive set function on $\mathfrak{B}$ which is absolutely continuous with respect to the measure $P$. Therefore there exists a $\mathfrak{B}$-measurable function $g(\omega)$ such that

$$\int_B g(\omega)\,dP = \int_B \xi\,dP.$$

The function $g(\omega)$ is unique (up to an equivalence). This function is by definition the conditional mathematical expectation of $\xi$ given the σ-algebra $\mathfrak{B}$.

Remark. Let $\tilde{\mathfrak{B}}$ be the completion of $\mathfrak{B}$ with respect to the probability $P$. It is easy to verify that

$$E\{\xi \mid \tilde{\mathfrak{B}}\} = E\{\xi \mid \mathfrak{B}\} \pmod P.$$

Since the class of $\tilde{\mathfrak{B}}$-measurable functions is wider than the class of $\mathfrak{B}$-measurable functions, it is sometimes expedient to consider the conditional mathematical expectation given the completed σ-algebras.

The conditional probabilities $P\{A \mid \mathfrak{B}\}$ given the σ-algebra $\mathfrak{B}$ are defined as a particular case of the conditional mathematical expectation by putting $\xi = \chi_A(\omega)$.

Definition 2. For a fixed $A$ the conditional probability $P\{A \mid \mathfrak{B}\}$ is a $\mathfrak{B}$-measurable random variable satisfying for every $B \in \mathfrak{B}$ the equation

$$\int_B P\{A \mid \mathfrak{B}\}\,dP = P(A \cap B). \tag{4}$$

Properties of conditional expectations and conditional probabilities. In 

this section we shall always assume that the random variables under 
consideration possess finite or infinite expectations and that the asser- 
tions stated or proved are valid with probability 1 . 

a) 

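b) If $\xi$ is a $\mathfrak{B}$-measurable random variable, then

$$E\{\xi \mid \mathfrak{B}\} = \xi.$$

In particular, if the event $B$ is $\mathfrak{B}$-measurable, then

$$P\{B \mid \mathfrak{B}\} = \chi_B.$$

c) $E\,E\{\xi \mid \mathfrak{B}\} = E\xi$.

d) If $E\xi_i \ne \infty$, $i = 1, 2$, then

$$E\{a\xi_1 + b\xi_2 \mid \mathfrak{B}\} = a\,E\{\xi_1 \mid \mathfrak{B}\} + b\,E\{\xi_2 \mid \mathfrak{B}\}.$$

To prove the last assertion it is sufficient to verify that its r.h.s. satisfies the definition of the conditional mathematical expectation of the random variable $a\xi_1 + b\xi_2$.

Setting $\xi_i = \chi_{B_i}$, $B_1 \cap B_2 = \emptyset$, we obtain as a particular case the additivity of the conditional probability:

$$P\{B_1 \cup B_2 \mid \mathfrak{B}\} = P\{B_1 \mid \mathfrak{B}\} + P\{B_2 \mid \mathfrak{B}\}.$$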

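e) If the sequence $\{\xi_n, n = 1, 2, \ldots\}$ is a monotonically non-decreasing sequence of non-negative random variables, then

$$\lim_{n \to \infty} E\{\xi_n \mid \mathfrak{B}\} = E\{\lim_{n \to \infty} \xi_n \mid \mathfrak{B}\}.$$

The proof of this assertion is an immediate consequence of the application of the Lebesgue monotone convergence theorem to the equality

$$\int_B E\{\xi_n \mid \mathfrak{B}\}\,dP = \int_B \xi_n\,dP. \qquad □$$

For conditional probabilities the above result yields:

if $\{B_n, n = 1, 2, \ldots\}$ is a monotonically increasing sequence of events, then

$$\lim_{n \to \infty} P\{B_n \mid \mathfrak{B}\} = P\Bigl\{\bigcup_{n=1}^{\infty} B_n \Bigm| \mathfrak{B}\Bigr\};$$

if $A_n$, $n = 1, 2, \ldots$, are pairwise disjoint, then

$$\sum_{n=1}^{\infty} P\{A_n \mid \mathfrak{B}\} = P\Bigl\{\bigcup_{n=1}^{\infty} A_n \Bigm| \mathfrak{B}\Bigr\}. \tag{5}$$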

Remark. The last property of conditional probabilities does not mean that they can be considered for a fixed $\omega$ as countably-additive set functions. For a given sequence $A_n$ the probability that equality (5) is not satisfied is zero, but the corresponding exceptional event depends on the choice of this sequence. Therefore it is possible that there is not a single $\omega$ for which (5) is valid for all the sequences $A_n$ in $\mathfrak{S}$.

f) If the random variable $\xi$ and the σ-algebra $\mathfrak{B}$ are independent, then

$$E\{\xi \mid \mathfrak{B}\} = E\xi. \tag{6}$$

The independence of the random variable $\xi$ and the σ-algebra $\mathfrak{B}$ by definition means that the σ-algebras $\sigma\{\xi\}$ and $\mathfrak{B}$ are independent. Therefore for an arbitrary $B \in \mathfrak{B}$

$$\int_B \xi\,dP = E\,\xi\chi_B = E\xi\,P(B),$$

and equality (3) will be satisfied if we set $E\{\xi \mid \mathfrak{B}\} = E\xi$.

It follows from the property just proven that if the event $A$ does not depend on the σ-algebra $\mathfrak{B}$, then

$$P\{A \mid \mathfrak{B}\} = P(A). \tag{7}$$

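g) If $\eta$ is a $\mathfrak{B}$-measurable random variable, then

$$E\{\xi\eta \mid \mathfrak{B}\} = \eta\,E\{\xi \mid \mathfrak{B}\}. \tag{8}$$

To prove this property it is sufficient to consider the case when $\eta$ is non-negative.

If $\eta = \chi_{B_1}$, $B_1 \in \mathfrak{B}$, then for any $B \in \mathfrak{B}$

$$\int_B \eta\,E\{\xi \mid \mathfrak{B}\}\,dP = \int_{B \cap B_1} E\{\xi \mid \mathfrak{B}\}\,dP = \int_{B \cap B_1} \xi\,dP = \int_B \eta\xi\,dP,$$

so that equality (8) is satisfied. Since the class of random variables satisfying (8) is linear and closed with respect to passage to the limits of monotone non-negative sequences, it contains arbitrary $\mathfrak{B}$-measurable non-negative random variables. □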

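Since the conditional expectation $E\{\xi \mid \mathfrak{B}\}$ is a random variable, one can examine the conditional expectation of this variable given another σ-algebra $\mathfrak{B}_1$. We thus arrive at the iterated conditional mathematical expectation $E\{E\{\xi \mid \mathfrak{B}\} \mid \mathfrak{B}_1\}$. We now establish an important property of this operation. Note that if $\mathfrak{B}$ and $\mathfrak{B}_1$ are two σ-algebras and $\mathfrak{B}_1 \subset \mathfrak{B}$, it follows from the definition of the conditional mathematical expectation that $E\{E\{\xi \mid \mathfrak{B}_1\} \mid \mathfrak{B}\} = E\{\xi \mid \mathfrak{B}_1\}$.

The following property is more profound:

h) Let $\mathfrak{B} \subset \mathfrak{B}_1$; then $E\{E\{\xi \mid \mathfrak{B}_1\} \mid \mathfrak{B}\} = E\{\xi \mid \mathfrak{B}\}$.

Indeed, if $B \in \mathfrak{B}$ then $B \in \mathfrak{B}_1$, hence

$$\int_B E\{E\{\xi \mid \mathfrak{B}_1\} \mid \mathfrak{B}\}\,dP = \int_B E\{\xi \mid \mathfrak{B}_1\}\,dP = \int_B \xi\,dP = \int_B E\{\xi \mid \mathfrak{B}\}\,dP.$$

Comparing the extreme terms in this chain of equalities, we obtain

$$E\{E\{\xi \mid \mathfrak{B}_1\} \mid \mathfrak{B}\} = E\{\xi \mid \mathfrak{B}\}. \qquad □$$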
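
Property h) can be verified on a finite space where σ-algebras correspond to partitions. The following Python sketch is an editorial illustration; the space (one roll of a fair die), the two nested partitions `FINE` and `COARSE`, and the helper `cond_exp` are assumptions made for the example. The coarse partition plays the role of $\mathfrak{B}$ and the fine one the role of $\mathfrak{B}_1 \supset \mathfrak{B}$.

```python
from fractions import Fraction

OMEGA = range(1, 7)                          # one roll of a fair die
P = {w: Fraction(1, 6) for w in OMEGA}
xi = {w: Fraction(w * w) for w in OMEGA}     # xi(omega) = omega^2

FINE = [{1, 2}, {3, 4}, {5, 6}]              # atoms of the larger sigma-algebra B_1
COARSE = [{1, 2, 3, 4}, {5, 6}]              # atoms of the smaller sigma-algebra B


def cond_exp(f, partition):
    """Conditional expectation of f given the sigma-algebra generated by `partition`."""
    out = {}
    for block in partition:
        pb = sum(P[w] for w in block)
        value = sum(f[w] * P[w] for w in block) / pb
        out.update({w: value for w in block})
    return out


inner = cond_exp(xi, FINE)                   # E{xi | B_1}
iterated = cond_exp(inner, COARSE)           # E{E{xi | B_1} | B}
direct = cond_exp(xi, COARSE)                # E{xi | B}

assert iterated == direct                    # property h)
# The simpler relation: conditioning a B-measurable variable on B_1 changes nothing.
assert cond_exp(direct, FINE) == direct
```

Both orders of iteration therefore collapse to conditioning on the smaller σ-algebra, here with values $15/2$ on $\{1, 2, 3, 4\}$ and $61/2$ on $\{5, 6\}$.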

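It follows from the result just proved that if $\mathfrak{B} \subset \mathfrak{B}_1$ and $\eta$ is a $\mathfrak{B}_1$-measurable random variable, then

$$E\{\xi\eta \mid \mathfrak{B}\} = E\{\eta\,E\{\xi \mid \mathfrak{B}_1\} \mid \mathfrak{B}\}. \tag{9}$$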

The following assertion which generalizes g) is often used. Let $\zeta = h(\omega)$, $\eta = s(\omega)$ be two measurable mappings of $\{\Omega, \mathfrak{S}\}$ into $\{\mathscr{X}, \mathfrak{N}\}$ and $\{\mathscr{Z}, \mathfrak{C}\}$ correspondingly. Let $g(x, z)$ be a numerical function defined on $\mathscr{X} \times \mathscr{Z}$, measurable with respect to $\sigma\{\mathfrak{N} \times \mathfrak{C}\}$, and let the mathematical expectation of $g(\zeta, \eta)$ be finite.

i) If $\eta$ is $\mathfrak{B}$-measurable ($\mathfrak{B} \subset \mathfrak{S}$), then we can define $E\{g(\zeta, z) \mid \mathfrak{B}\}$ such that

$$E\{g(\zeta, \eta) \mid \mathfrak{B}\} = E\{g(\zeta, z) \mid \mathfrak{B}\}\big|_{z = \eta}.$$

To prove this assertion we note that in view of g) the last formula is valid for the functions of the form $g(x, z) = \sum_{k=1}^{n} g_k(x)\,h_k(z)$. For arbitrary functions $g(x, z)$ such that $E|g(\zeta, \eta)| < \infty$ the assertion follows from the existence of a sequence of functions $h_n(x, z)$ of the previous type and such that $h_n(\zeta, \eta)$ converges to $g(\zeta, \eta)$ with probability 1 and also in $L_1$. □



Conditional mathematical expectation given a random variable. Let $\zeta$ be
a random variable taking on values $z_1, z_2, \ldots, z_n, \ldots$, with $P(\zeta = z_n) > 0$,
and let $B_n$ denote the event $\{\zeta = z_n\}$. Then

$$P\{A \mid \zeta = z_n\} = \frac{P(A \cap B_n)}{P(B_n)}$$

is the conditional probability of $A$ given $\zeta = z_n$. The conditional
mathematical expectation of the variable $\xi$ given $\zeta = z_n$ is defined by the formula

$$E\{\xi \mid \zeta = z_n\} = \int_{\Omega} \xi\, P\{d\omega \mid \zeta = z_n\} = \frac{1}{P(B_n)} \int_{B_n} \xi\, dP.$$

Regarding this sequence of numbers as a function of the outcome of
the experiment to determine the value of $\zeta$, we arrive at the notion of
the conditional mathematical expectation of $\xi$ given the random variable
$\zeta$. This is the random variable $E\{\xi \mid \zeta\}$ taking on the value $E\{\xi \mid \zeta = z_n\}$ if
$\zeta = z_n$. This definition coincides with the previously stated definition of
the conditional mathematical expectation given a countable partition
of the space $\Omega$. The place of $\mathfrak{M}$ is taken by the events $\{B_n;\ n = 1, 2, \ldots\}$. The
last remark leads us to a general definition.
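The passage from the numbers $E\{\xi \mid \zeta = z_n\}$ to the random variable $E\{\xi \mid \zeta\}$ can be sketched for a single roll of a fair die. The choice of experiment and the names `xi`, `zeta` and `cond_exp_given_value` are assumptions made purely for illustration.

```python
from fractions import Fraction

omega = [1, 2, 3, 4, 5, 6]              # outcomes of one roll of a fair die (assumed example)
P = {w: Fraction(1, 6) for w in omega}
xi = lambda w: w                        # xi: the face shown
zeta = lambda w: w % 2                  # zeta: 1 for odd faces, 0 for even faces

def cond_exp_given_value(z):
    """E{xi | zeta = z} = (1 / P(B)) * integral of xi over B = {zeta = z}."""
    B = [w for w in omega if zeta(w) == z]
    return sum(xi(w) * P[w] for w in B) / sum(P[w] for w in B)

# The sequence of numbers E{xi | zeta = z_n}, one per value of zeta.
values = {z: cond_exp_given_value(z) for z in {zeta(w) for w in omega}}

# The random variable E{xi | zeta}: it takes the value E{xi | zeta = z_n} on {zeta = z_n}.
cond_exp = {w: values[zeta(w)] for w in omega}
```

Here `values` holds 4 for the even faces and 3 for the odd faces, and averaging `cond_exp` over all outcomes returns the unconditional expectation 7/2.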

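Consider a measurable mapping $\zeta = g(\omega)$ of the space $\{\Omega, \mathfrak{S}\}$ into the
measurable space $\{\mathcal{X}, \mathfrak{B}\}$. Thus $\zeta$ is a random element with values
in $\mathcal{X}$. Let $\mathfrak{F}_\zeta \subset \mathfrak{S}$ be the σ-algebra generated by the mapping $\zeta$:
$\mathfrak{F}_\zeta = \{S : S = g^{-1}(B),\ B \in \mathfrak{B}\}$.

Definition 3. The random variable $E\{\xi \mid \mathfrak{F}_\zeta\}$ is called the conditional
mathematical expectation of a random variable $\xi$ given a random element
$\zeta$; it is denoted by $E\{\xi \mid \zeta\}$.

This definition is equivalent to the following: for any $B \in \mathfrak{B}$

$$\int_{g^{-1}(B)} E\{\xi \mid \zeta\}\, dP = \int_{g^{-1}(B)} \xi\, dP. \tag{10}$$

The conditional probability given $\zeta$ is defined analogously:

$$P\{A \mid \zeta\} = P\{A \mid \mathfrak{F}_\zeta\}.$$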

Theorem 1. The conditional mathematical expectation given the random
element $\zeta$ is a $\mathfrak{B}$-measurable function of $\zeta$:

$$E\{\xi \mid \zeta\} = s(\zeta),$$

where $s(x)$ is a $\mathfrak{B}$-measurable function.

Proof. Let $\xi$ be non-negative. We have

$$\int_{g^{-1}(B)} E\{\xi \mid \zeta\}\, dP = \int_{g^{-1}(B)} \xi\, dP = q(B). \tag{11}$$

Clearly, $q(B)$ is a σ-finite measure on $\mathfrak{B}$. Moreover, $q(B) = 0$ if
$P\{g^{-1}(B)\} = 0$, i.e. $q$ is absolutely continuous with respect to the measure $P_g$, where
$P_g(A) = P\{g^{-1}(A)\}$. In view of the Radon-Nikodym theorem there exists
a $\mathfrak{B}$-measurable non-negative function $s(x)$ such that

$$q(B) = \int_B s(x)\, P_g(dx).$$

Applying the rule of change of variables, we obtain

$$q(B) = \int_{g^{-1}(B)} s(g(\omega))\, P(d\omega).$$

Comparing the last equation with (11) we obtain the equality
$E\{\xi \mid \zeta\} = s(g(\omega)) = s(\zeta)$. □

The theorem just proven shows that the conditional mathematical
expectation given the random element $\zeta$ can be considered as a function
of a variable $x$ in the measure space $\{\mathcal{X}, \mathfrak{B}, P_g\}$, while in the initial
definition the conditional mathematical expectation is a function of an
elementary event $\omega$. The function $s(x)$ is uniquely determined by the relation

$$\int_{g^{-1}(B)} \xi\, dP = \int_B s(x)\, P_g(dx), \tag{12}$$

which is valid for $B \in \mathfrak{B}$.
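When the phase space is finite, the function $s(x)$ of Theorem 1 is simply the ratio $q(\{x\}) / P_g(\{x\})$, and relation (12) can be verified by direct summation. The sketch below assumes an invented experiment (two rolls of a four-sided die) and hypothetical names `g`, `P_g`, `q` and `s` chosen to mirror the notation of the proof.

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 5), repeat=2))   # two rolls of a fair 4-sided die (assumed)
P = {w: Fraction(1, 16) for w in omega}
xi = lambda w: w[0] * w[1]                     # a non-negative random variable
g = lambda w: max(w)                           # the random element zeta = g(omega)

states = sorted({g(w) for w in omega})
# Image measure P_g({x}) = P{g^{-1}({x})}.
P_g = {x: sum(P[w] for w in omega if g(w) == x) for x in states}
# q({x}) = integral of xi over g^{-1}({x}), as in (11).
q = {x: sum(xi(w) * P[w] for w in omega if g(w) == x) for x in states}
# Radon-Nikodym density of q with respect to P_g.
s = {x: q[x] / P_g[x] for x in states}

def lhs_12(B):
    """Integral of xi over g^{-1}(B)."""
    return sum(xi(w) * P[w] for w in omega if g(w) in B)

def rhs_12(B):
    """Integral of s(x) over B with respect to P_g."""
    return sum(s[x] * P_g[x] for x in B)
```

The random variable $E\{\xi \mid \zeta\}$ is then obtained by composing: its value at an outcome `w` is `s[g(w)]`.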
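We now present several propositions which follow directly from the
previously obtained properties of conditional mathematical expectations.
a) Let $\xi$ and $\zeta$ be independent. Then $E\{\xi \mid \zeta\} = E\xi$.
b) If $\xi$ is an $\mathfrak{F}_\zeta$-measurable random variable, then

$$E\{\xi\eta \mid \zeta\} = \xi\, E\{\eta \mid \zeta\}.$$

c) If $\eta_i = g_i(\omega)$ are measurable mappings of $\{\Omega, \mathfrak{S}\}$ into $\{\mathcal{X}_i, \mathfrak{B}_i\}$
($i = 1, 2$), then

$$E\{E\{\xi \mid (\eta_1, \eta_2)\} \mid \eta_1\} = E\{\xi \mid \eta_1\},$$

where $(\eta_1, \eta_2)$ denotes the mapping $\omega \to (g_1(\omega), g_2(\omega))$ of the space $\{\Omega, \mathfrak{S}\}$
into the product space $\{\mathcal{X}_1 \times \mathcal{X}_2,\ \sigma\{\mathfrak{B}_1 \times \mathfrak{B}_2\}\}$.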

Regular probabilities. As we pointed out previously, one cannot in gen- 
eral consider conditional probabilities as measures depending on an 
elementary event. However, in a number of cases this interpretation is 
valid. We now state the problem more precisely. 

The conditional probability $P\{A \mid \mathfrak{B}\} = h(\omega, A)$ is a function of $\omega \in \Omega$
and $A \in \mathfrak{S}$ defined for each fixed $A$ only up to an event with probability 0.
The question arises whether it is possible to find a function $p(\omega, A)$
($\omega \in \Omega$, $A \in \mathfrak{S}$) such that:

a) for a fixed $\omega$ the function $p(\omega, A)$ is a probability on the σ-algebra $\mathfrak{S}$;

b) $h(\omega, A) = p(\omega, A)$ almost surely for an arbitrary fixed $A$.

Definition 4. If there exists a function $p(\omega, A)$ satisfying conditions a)
and b), then the family of conditional probabilities $P\{A \mid \mathfrak{B}\}$ is called
regular. In this case $P\{A \mid \mathfrak{B}\}$ is identified with $p(\omega, A)$.

In the regular case the conditional mathematical expectations, as 
would be expected, are expressed by means of integrals with conditional 
probabilities serving as measures. 

Theorem 2. If $P\{A \mid \mathfrak{B}\}$ is a regular conditional probability and $\xi = f(\omega)$, then

$$E\{\xi \mid \mathfrak{B}\} = \int_{\Omega} f(\omega)\, P\{d\omega \mid \mathfrak{B}\} \pmod{P}. \tag{13}$$

It is not difficult to prove this assertion. Firstly, the assertion is valid
in the case when $\xi$ is the indicator of an event $A \in \mathfrak{S}$. In view of the
linearity of both sides of equality (13) regarded as functionals of $f$, it follows
that the assertion is valid also for simple functions. By taking the limits
of monotonically increasing sequences of simple functions we prove (13)
for an arbitrary non-negative random variable $\xi$. Finally, repeated utilization of
the linearity of both sides of (13) concludes the proof. □
In certain cases it is necessary to emphasize that the conditional
probability is a function of an elementary event. In such a case we
write $P\{A \mid \mathfrak{B}\} = P_{\mathfrak{B}}(\omega, A)$ or simply $P(\omega, A)$ if the σ-algebra $\mathfrak{B}$ is fixed.
Analogously, $P_\zeta(\omega, A)$ will serve as an alternative notation for $P\{A \mid \zeta\}$.

Since the property of regularity of conditional probabilities does not
always hold, it seems desirable to extend this notion somewhat.

Let $\{\mathcal{X}, \mathfrak{B}\}$ be a measurable space, $\zeta$ be a random element in $\{\mathcal{X}, \mathfrak{B}\}$
and $\mathfrak{F}$ be a σ-algebra, $\mathfrak{F} \subset \mathfrak{S}$.

Definition 5. The function $Q(\omega, B)$ defined on $\Omega \times \mathfrak{B}$ is called the
regular conditional distribution of a random element $\zeta$ given a σ-algebra $\mathfrak{F}$,
provided the following is valid:

a) for a fixed $B \in \mathfrak{B}$, $Q(\omega, B)$ is $\mathfrak{F}$-measurable,

b) with probability 1, for a fixed $\omega$, $Q(\omega, B)$ is a probability measure on $\mathfrak{B}$,

c) for each $B \in \mathfrak{B}$, $Q(\omega, B) = P\{(\zeta \in B) \mid \mathfrak{F}\} \pmod{P}$.

The last requirement is equivalent to the following: for an arbitrary
$F \in \mathfrak{F}$

$$\int_F Q(\omega, B)\, P(d\omega) = P\{(\zeta \in B) \cap F\}. \tag{14}$$

Theorem 3. Let $\mathcal{X}$ be a complete separable metric space, $\mathfrak{B}$ be the σ-algebra
of Borel sets in $\mathcal{X}$, $\zeta$ be a random element in $\{\mathcal{X}, \mathfrak{B}\}$ and $\mathfrak{F}$ an arbitrary
σ-algebra, $\mathfrak{F} \subset \mathfrak{S}$. Then there exists a regular conditional distribution of the
random element $\zeta$ given the σ-algebra $\mathfrak{F}$.

Proof. Set $q(B) = P(\{\zeta \in B\})$, $B \in \mathfrak{B}$. Using a theorem in measure theory
one can find, for any $n$, a compact set $K_n$ in $\mathcal{X}$ such that $q(\mathcal{X} \setminus K_n) < 1/n$ (cf.
Chapter V, Section 2, Remark 1 to Theorem 1).

Denote by $\mathcal{C}(\mathcal{Z})$, where $\mathcal{Z}$ is a metric space, the space of real-valued
functions, continuous and bounded on $\mathcal{Z}$, with the metric
$\rho(f, g) = \|f - g\| = \sup\{|f(x) - g(x)|,\ x \in \mathcal{Z}\}$.

The space $\mathcal{C}(K_n)$ is separable. Let $\{f_{nk}(x),\ k = 1, 2, \ldots\}$ be a countable
everywhere dense net in $\mathcal{C}(K_n)$. We extend $f_{nk}(x)$ to the whole of $\mathcal{X}$ in such
a manner that $\sup\{|f_{nk}(x)|,\ x \in \mathcal{X}\} = \max\{|f_{nk}(x)|,\ x \in K_n\}$. Set $\chi_n = \chi_n(\zeta)$,
where $\chi_n(x)$ is the indicator of the compact set $K_n$. It follows from the
properties of mathematical expectations that there exists $D_0 \in \mathfrak{S}$ such that
$P(D_0) = 0$ and for $\omega \notin D_0$ the following relationships are valid:

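if $f_{nk}(x) \geq 0$ then $E\{f_{nk}(\zeta) \mid \mathfrak{F}\} \geq 0$;

if $|f_{nk}(x) - f_{nj}(x)| < r$ then $E\{|f_{nk}(\zeta) - f_{nj}(\zeta)| \mid \mathfrak{F}\} < r$;

$$E\{f_{nk}(\zeta) \pm f_{nj}(\zeta) \mid \mathfrak{F}\} = E\{f_{nk}(\zeta) \mid \mathfrak{F}\} \pm E\{f_{nj}(\zeta) \mid \mathfrak{F}\},$$

$$\lim_{m \to \infty} E\{(1 - \chi_m) f_{nk}(\zeta) \mid \mathfrak{F}\} = 0$$

for all $n$, $k$ and $j$ and rational $r$. On the other hand, for an arbitrary $F \in \mathfrak{F}$

$$\left| \int_F \big( E\{f(\zeta) \mid \mathfrak{F}\} - E\{f_{nk}(\zeta) \mid \mathfrak{F}\} \big)\, dP \right| \leq \int_{\mathcal{X}} |f(x) - f_{nk}(x)|\, q(dx). \tag{15}$$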

Therefore if one chooses an arbitrary sequence $f_{nk_n}$ such that
$\|\chi_n (f(x) - f_{nk_n}(x))\| \to 0$ (clearly, for any $f \in \mathcal{C}(\mathcal{X})$ such a sequence can be
selected), then the functions $f_{nk_n}(x)$ are uniformly bounded and

$$\int_{\mathcal{X}} |f(x) - f_{nk_n}(x)|\, q(dx) \to 0;$$

also, in view of (15),

$$E\{f(\zeta) \mid \mathfrak{F}\} = \lim_{n \to \infty} E\{f_{nk_n}(\zeta) \mid \mathfrak{F}\} \pmod{P},$$

where the limit on the right exists and is independent of the choice of
the approximating sequence ($\omega \notin D_0$). Since conditional mathematical
expectations are defined only up to sets of probability zero (mod $P$), we stipulate that for an
arbitrary $f \in \mathcal{C}(\mathcal{X})$, $E\{f(\zeta) \mid \mathfrak{F}\}$ is defined by the last relationship.

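Using this definition, the conditional mathematical expectation
possesses the following properties (for all $\omega \notin D_0$):

$$E\{f(\zeta) \mid \mathfrak{F}\} \geq 0 \quad \text{if } f \geq 0,$$

$$E\{\alpha_1 f_1(\zeta) + \alpha_2 f_2(\zeta) \mid \mathfrak{F}\} = \alpha_1 E\{f_1(\zeta) \mid \mathfrak{F}\} + \alpha_2 E\{f_2(\zeta) \mid \mathfrak{F}\},$$

$$E\{f(\zeta)(1 - \chi_n(\zeta)) \mid \mathfrak{F}\} \to 0 \quad \text{as } n \to \infty.$$

Thus, $L_\omega(f) = E\{f(\zeta) \mid \mathfrak{F}\}$ represents a positive linear functional on $\mathcal{C}(\mathcal{X})$
and in view of a theorem on the representation of linear functionals (cf.
Chapter V, §2, Theorem 1), there exists on $\{\mathcal{X}, \mathfrak{B}\}$ a measure $q_\omega(B)$ such
that

$$E\{f(\zeta) \mid \mathfrak{F}\} = \int_{\mathcal{X}} f(x)\, q_\omega(dx).$$

It is easy to verify that this formula can be considered as a definition
of the conditional mathematical expectation for an arbitrary
$\mathfrak{B}$-measurable non-negative (or $q$-integrable) function. Putting $f(x) = \chi_B(x)$, we
obtain $P\{(\zeta \in B) \mid \mathfrak{F}\} = q_\omega(B)$ for $\omega \notin D_0$ and $P\{(\zeta \in B) \mid \mathfrak{F}\} = q(B)$ for $\omega \in D_0$.
Hence, the required assertion follows. □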

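Consider random elements $\zeta_1$ and $\zeta_2$ in $\{\mathcal{X}_1, \mathfrak{B}_1\}$ and $\{\mathcal{X}_2, \mathfrak{B}_2\}$
respectively, where $\mathcal{X}_i$ is a complete separable metric space and $\mathfrak{B}_i$ is the
σ-algebra of Borel sets of $\mathcal{X}_i$ ($i = 1, 2$). We set

$$\mathcal{X}^{(1,2)} = \mathcal{X}_1 \times \mathcal{X}_2, \qquad \mathfrak{B}^{(1,2)} = \sigma\{\mathfrak{B}_1 \times \mathfrak{B}_2\}. \tag{16}$$

The sequence $\zeta^{(1,2)} = (\zeta_1, \zeta_2)$ can be considered as a random element in
$\{\mathcal{X}^{(1,2)}, \mathfrak{B}^{(1,2)}\}$; $\mathcal{X}^{(1,2)}$ is a complete separable metric space.

Let $q_i$ denote the distribution of $\zeta_i$ ($i = 1, 2$), $q^{(1,2)}$ the distribution
of $\zeta^{(1,2)}$ and $Q(\omega, B_2)$ the regular conditional distribution of $\zeta_2$ given the
σ-algebra $\mathfrak{F}_{\zeta_1}$ generated by the random element $\zeta_1$. Since $Q(\omega, B_2)$ is an
$\mathfrak{F}_{\zeta_1}$-measurable function, it follows that

$$Q(\omega, B_2) = q(B_2, \zeta_1),$$

where $B_2 \in \mathfrak{B}_2$ and the function $q(B_2, y)$ is $\mathfrak{B}_1$-measurable. It follows
from the definition of conditional probabilities that

$$\int_{g_1^{-1}(B_1)} q(B_2, \zeta_1)\, dP = q^{(1,2)}(B_1 \times B_2),$$

where $B_1$ is an arbitrary set in $\mathfrak{B}_1$ and $\zeta_1 = g_1(\omega)$.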

Using the rule of change of variables we can rewrite this equality in
the form

$$q^{(1,2)}(B_1 \times B_2) = \int_{B_1} q(B_2, y_1)\, dq_1$$

or

$$q^{(1,2)}(B_1 \times B_2) = \int_{\mathcal{X}_1} \chi_{B_1}(y_1) \left[ \int_{\mathcal{X}_2} \chi_{B_2}(y_2)\, q(dy_2, y_1) \right] dq_1,$$

where $\chi_{B_i}$ are the indicators of the sets $B_i$ in the space $\mathcal{X}_i$. It follows from
the last formula that

$$\int_{\mathcal{X}^{(1,2)}} f(y_1, y_2)\, dq^{(1,2)} = \int_{\mathcal{X}_1} \left[ \int_{\mathcal{X}_2} f(y_1, y_2)\, q(dy_2, y_1) \right] dq_1 \tag{17}$$

for any $\mathfrak{B}^{(1,2)}$-measurable non-negative function $f$. Indeed, the class of
functions $f$ for which formula (17) is valid is linear and closed with
respect to passage to the limit of monotone sequences. Since it contains
functions of the form $\chi_{B_1}(y_1)\chi_{B_2}(y_2)$, it also contains their linear combinations.

On the other hand, an arbitrary non-negative $\mathfrak{B}^{(1,2)}$-measurable function can be
approximated by monotonically increasing sequences of linear
combinations of functions of the form $\chi_{B_1}(y_1)\chi_{B_2}(y_2)$.
Note that formula (17) is valid also for functions $f$ with alternating
signs provided only that one of the sides of equality (17) is meaningful. It
follows from formula (17) that

$$E\{f(\zeta_1, \zeta_2) \mid \mathfrak{F}_{\zeta_1}\} = \int_{\mathcal{X}_2} f(\zeta_1, y_2)\, q(dy_2, \zeta_1). \tag{18}$$

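The result obtained can be presented in the following more general
form. Let $\zeta_k$ be random elements in $\{\mathcal{X}_k, \mathfrak{B}_k\}$, where $\mathcal{X}_k$ is a complete
separable metric space. Set

$$\mathcal{X}^{(s)} = \prod_{k=1}^{s} \mathcal{X}_k, \qquad \mathfrak{B}^{(s)} = \sigma\{\mathfrak{B}_k,\ k = 1, \ldots, s\}, \qquad \zeta^{(s)} = (\zeta_1, \zeta_2, \ldots, \zeta_s);$$

let $q_k$ be the distribution of the element $\zeta_k$ in $\{\mathcal{X}_k, \mathfrak{B}_k\}$, and let
$q^{(s)}(B_s, \zeta_1, \ldots, \zeta_{s-1})$ be the regular conditional distribution of the
element $\zeta_s$ given the σ-algebra $\mathfrak{F}_{\zeta^{(s-1)}} = \mathfrak{F}_{(\zeta_1, \ldots, \zeta_{s-1})}$.

Applying repeatedly relation (18) we obtain from the formula

$$E\{f \mid \mathfrak{F}_{\zeta_1}\} = E\{\ldots E\{E\{f \mid \mathfrak{F}_{\zeta^{(n-1)}}\} \mid \mathfrak{F}_{\zeta^{(n-2)}}\} \ldots \mid \mathfrak{F}_{\zeta_1}\}$$

the following relations:

$$E\{f(\zeta_1, \ldots, \zeta_n) \mid \mathfrak{F}_{\zeta_1}\} = \int_{\mathcal{X}_2} \cdots \int_{\mathcal{X}_n} f(\zeta_1, y_2, \ldots, y_n)\, q^{(n)}(dy_n, \zeta_1, y_2, \ldots, y_{n-1}) \times q^{(n-1)}(dy_{n-1}, \zeta_1, y_2, \ldots, y_{n-2}) \times \ldots \times q^{(2)}(dy_2, \zeta_1), \tag{19}$$

$$E\{f(\zeta_1, \ldots, \zeta_n)\} = \int_{\mathcal{X}_1} \cdots \int_{\mathcal{X}_n} f(y_1, \ldots, y_n)\, q^{(n)}(dy_n, y_1, \ldots, y_{n-1}) \times q^{(n-1)}(dy_{n-1}, y_1, \ldots, y_{n-2}) \times \ldots \times q^{(2)}(dy_2, y_1)\, q_1(dy_1). \tag{20}$$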
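Formula (20) can be illustrated for three random elements with finitely many values, where every integral reduces to a sum and the conditional distributions are ratios of joint probabilities. The joint law, the function `f` and the helper names below are assumptions chosen only for this sketch.

```python
from fractions import Fraction
from itertools import product

vals = [0, 1]
# Assumed joint distribution of (zeta_1, zeta_2, zeta_3): positive weights, normalised.
weights = {y: 1 + y[0] + 2 * y[1] * y[2] + y[0] * y[2] for y in product(vals, repeat=3)}
total = sum(weights.values())
joint = {y: Fraction(w, total) for y, w in weights.items()}

def prefix_prob(prefix):
    """Probability that (zeta_1, ..., zeta_k) equals prefix (k may be 0)."""
    k = len(prefix)
    return sum(p for y, p in joint.items() if y[:k] == prefix)

def q_cond(y_k, *previous):
    """q^(k)({y_k}, y_1, ..., y_{k-1}): conditional law of zeta_k given the earlier values."""
    return prefix_prob(previous + (y_k,)) / prefix_prob(previous)

f = lambda y1, y2, y3: (y1 + 1) * (y2 - 2 * y3)

# Right-hand side of (20): integrate successively against the conditional distributions.
iterated = sum(
    f(y1, y2, y3) * q_cond(y3, y1, y2) * q_cond(y2, y1) * q_cond(y1)
    for y1, y2, y3 in product(vals, repeat=3)
)
# Left-hand side of (20): the expectation computed from the joint distribution.
direct = sum(f(*y) * p for y, p in joint.items())
```

With no earlier values, `q_cond(y1)` is just the distribution $q_1$ of $\zeta_1$, so the product of the three factors telescopes to the joint probability.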

Conditional densities. Let $\{\mathcal{X}, \mathfrak{A}, m\}$ be a measure space and let $\zeta = g(\omega)$
be a measurable mapping of $\{\Omega, \mathfrak{S}\}$ into $\{\mathcal{X}, \mathfrak{A}\}$.

We say that a random element $\zeta$ possesses density function $\rho(x)$ (with
respect to the measure $m$), if for an arbitrary $A \in \mathfrak{A}$

$$P(\zeta \in A) = \int_A \rho(x)\, m(dx).$$

In view of the Radon-Nikodym theorem a random element $\zeta$ possesses
a density function iff the measure $P_g$ is absolutely continuous with
respect to $m$.

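Let $\{\mathcal{Y}, \mathfrak{B}, q\}$ be some other measure space and let $\eta = h(\omega)$ be a
random element in $\{\mathcal{Y}, \mathfrak{B}\}$. The pair $(\zeta, \eta)$ can be considered as a
measurable mapping of $\{\Omega, \mathfrak{S}\}$ into the product space $\{\mathcal{X} \times \mathcal{Y},\ \sigma\{\mathfrak{A} \times \mathfrak{B}\}\}$.
Indeed, if the set $C$ in $\sigma\{\mathfrak{A} \times \mathfrak{B}\}$ is of the form $C = A \times B$, $A \in \mathfrak{A}$, $B \in \mathfrak{B}$,
then the event $\{(\zeta, \eta) \in C\} = \{\zeta \in A\} \cap \{\eta \in B\} \in \mathfrak{S}$; therefore any event in
the minimal σ-algebra containing events of the form $\{(\zeta, \eta) \in A \times B\}$, i.e.
any event of the form $\{(\zeta, \eta) \in C\}$, $C \in \sigma\{\mathfrak{A} \times \mathfrak{B}\}$, will be $\mathfrak{S}$-measurable.
Assume that the pair $(\zeta, \eta)$ possesses density function $\rho(x, y)$ with respect
to the measure $m \times q$. Then for any $A \in \mathfrak{A}$ and $B \in \mathfrak{B}$,

$$P\{\zeta \in A,\ \eta \in B\} = \int_A \int_B \rho(x, y)\, m(dx)\, q(dy).$$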

The function $\rho(x, y)$ is called the joint density function of the random
elements $\zeta$ and $\eta$. The existence of the joint density function implies
that each one of the random elements $\zeta$ and $\eta$ possesses a density with
respect to the corresponding measure. Indeed,

$$P(\zeta \in A) = \int_A \int_{\mathcal{Y}} \rho(x, y)\, m(dx)\, q(dy) = \int_A \rho_\zeta(x)\, m(dx),$$

where

$$\rho_\zeta(x) = \int_{\mathcal{Y}} \rho(x, y)\, q(dy).$$

Analogously,

$$P(\eta \in B) = \int_B \rho_\eta(y)\, q(dy),$$

where

$$\rho_\eta(y) = \int_{\mathcal{X}} \rho(x, y)\, m(dx).$$

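We now show how to compute the conditional mathematical expectation
$E\{f(\eta) \mid \zeta\}$ if $(\zeta, \eta)$ possesses the probability density $\rho(x, y)$. From the
definition of the conditional mathematical expectation we obtain

$$\int_{g^{-1}(A)} E\{f(\eta) \mid \zeta\}\, dP = \int_{g^{-1}(A)} f(\eta)\, dP = E f(\eta)\chi_A(\zeta) = \int_{\mathcal{X}} \int_{\mathcal{Y}} f(y)\chi_A(x)\rho(x, y)\, m(dx)\, q(dy)$$

$$= \int_A \left( \int_{\mathcal{Y}} f(y) \frac{\rho(x, y)}{\rho_\zeta(x)}\, q(dy) \right) \rho_\zeta(x)\, m(dx) = \int_{g^{-1}(A)} \psi(\zeta)\, dP,$$

where

$$\psi(x) = \int_{\mathcal{Y}} f(y) \frac{\rho(x, y)}{\rho_\zeta(x)}\, q(dy).$$

Thus

$$E\{f(\eta) \mid \zeta\} = \int_{\mathcal{Y}} f(y) \frac{\rho(\zeta, y)}{\rho_\zeta(\zeta)}\, q(dy).$$

The quantity $\dfrac{\rho(x, y)}{\rho_\zeta(x)} = \rho(y \mid x)$ is called the conditional density function of
the random element $\eta$ given $\zeta = x$. Using this density the conditional
mathematical expectation given $\zeta$ is computed by the formula

$$E\{f(\eta) \mid \zeta\} = \int_{\mathcal{Y}} f(y)\, \rho(y \mid \zeta)\, q(dy).$$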
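The computation with conditional densities can be sketched in the simplest setting, where $m$ and $q$ are counting measures on finite sets, so that densities are probability tables and integrals are sums. The table `rho` and the function `f` below are invented for illustration.

```python
from fractions import Fraction

# Assumed joint density rho(x, y) of (zeta, eta) with respect to m x q, where m and q
# are counting measures on X = {0, 1} and Y = {0, 1, 2}.
X, Y = [0, 1], [0, 1, 2]
rho = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(1, 10), (0, 2): Fraction(2, 10),
    (1, 0): Fraction(3, 10), (1, 1): Fraction(1, 10), (1, 2): Fraction(2, 10),
}

# Marginal density of zeta: rho_zeta(x) = integral of rho(x, y) q(dy).
rho_zeta = {x: sum(rho[x, y] for y in Y) for x in X}

# Conditional density rho(y | x) = rho(x, y) / rho_zeta(x), stored under the key (y, x).
cond_density = {(y, x): rho[x, y] / rho_zeta[x] for x in X for y in Y}

def cond_exp(f, x):
    """E{f(eta) | zeta = x} = integral of f(y) rho(y | x) q(dy)."""
    return sum(f(y) * cond_density[y, x] for y in Y)

f = lambda y: y * y
```

Averaging `cond_exp(f, x)` against `rho_zeta` recovers $E f(\eta)$, which is the defining relation of the conditional expectation applied with $A = \mathcal{X}$.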



§ 4. Random Functions and Random Mappings 



Definitions. Let $\{\Omega, \mathfrak{S}, P\}$ be a given probability space. If the realization
of an experiment is described by means of a function $f(x)$ of a definite
argument $x \in X$, we say that a random function is defined on $\{\Omega, \mathfrak{S}, P\}$.
Thus a random function is the mapping $\omega \to f(x) = f(x, \omega)$, $\omega \in \Omega$.
Additionally it is required that the function $f(x, \omega)$ for a fixed $x$ be
a random variable (or a random element).

The general definition is as follows. Let $X$ be a set and $\{\mathcal{X}, \mathfrak{B}\}$ be a
measurable space.

Definition 1. A random mapping $\zeta(x)$ of a set $X$ into a measurable space
$\{\mathcal{X}, \mathfrak{B}\}$ is a mapping of $X \times \Omega$ into $\mathcal{X}$ which is, for arbitrary
fixed $x$, a measurable mapping of $\{\Omega, \mathfrak{S}\}$ into $\{\mathcal{X}, \mathfrak{B}\}$, i.e. such that for
any $B \in \mathfrak{B}$

$$\{\omega : \zeta(x) \in B\} \in \mathfrak{S}.$$

In place of the term "random mapping" we shall henceforth use the
term "random function with values in $\mathcal{X}$" as well; $x$ is referred
to as the argument of a random function. In the case when $X$ is a real line
or a segment of a line and the argument of the random function is
interpreted as time, we shall use the letters $T$ and $t$ in place of $X$ and $x$, respectively,
and we shall call the random function a random process. If the argument
of the random function takes on integral non-negative values
($X = T_+ = \{0, 1, 2, \ldots, n, \ldots\}$) or arbitrary integral values
($X = T = \{\ldots, -n, -n+1, \ldots, -1, 0, 1, \ldots, n, \ldots\}$), the random function is then called a discrete
parameter random process. If $X$ is a finite-dimensional Euclidean space
or a region in it, then $\zeta(x)$ is sometimes called a random field.

The following particular case of the general definition is of interest.
Assume that $\Omega$ is a functional space, $\omega = \omega(x)$, $x \in X$, the σ-algebra $\mathfrak{S}$
contains all the sets of the space $\Omega$ of the form

$$\{\omega : \omega(x_0) \in B\}$$

for any $x_0 \in X$ and $B \in \mathfrak{B}$, and $P$ is an arbitrary probability measure on $\mathfrak{S}$.
It is natural to associate the random function $g(x, \omega) = \omega(x)$ with such
a probability space. In certain problems it is convenient to identify the
random function $g(x, \omega) = \omega(x)$ with the probability space $\{\Omega, \mathfrak{S}, P\}$ of
the type described here.

It is easy to see that the general definition of a random function can
be reduced to the above described particular case. Indeed, if a random
function is defined as a function of two variables $\zeta(x) = g(x, \omega)$, then
by putting $u = g(x, \omega)$, where $\omega$ is fixed, $\omega \in \Omega$, and denoting by $U$ the set
of all functions $\{u : u = u_\omega(x) = g(x, \omega),\ \omega \in \Omega\}$ we obtain a mapping $S$ of
the set $\Omega$ onto $U$. Here the σ-algebra $\mathfrak{S}$ of the sets in $\Omega$ is mapped into a
σ-algebra $\mathfrak{S}'$ of the sets in $U$, and the probability measure $P$ on $\mathfrak{S}$ is
mapped into the probability measure $P'$ on $\mathfrak{S}'$. For any fixed $x$ the set
$\{u : u(x) \in B\}$, $B \in \mathfrak{B}$, belongs to $\mathfrak{S}'$, since

$$S^{-1}\{u : u(x) \in B\} = \{\omega : g(x, \omega) \in B\} \in \mathfrak{S}.$$

Thus a probability space $\{U, \mathfrak{S}', P'\}$ is obtained where $U$ is a set of
functions $u = u(x)$ and for any $n$, $x_1, x_2, \ldots, x_n$ ($x_k \in X$, $k = 1, \ldots, n$) the
distribution of the sequence of random elements on $\{\Omega, \mathfrak{S}, P\}$

$$g(x_1, \omega),\ g(x_2, \omega),\ \ldots,\ g(x_n, \omega)$$

coincides with the distribution of the sequence

$$u(x_1),\ u(x_2),\ \ldots,\ u(x_n).$$

Hence, the random function is equivalent in a definite sense, to a 
certain functional space with a probability measure, i.e. to a probability 
space in which the space of elementary events is given by a certain set of 
functions. 

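Let $\zeta(x)$, $x \in X$, be a random function with values in $\{\mathcal{X}, \mathfrak{B}\}$, let $n$ be an
arbitrary integer and let $x_k$ ($k = 1, 2, \ldots, n$) be arbitrary points in $X$. Consider
a random element in $\{\mathcal{X}^n, \mathfrak{B}^n\}$ determined by the sequence

$$\{\zeta(x_1), \zeta(x_2), \ldots, \zeta(x_n)\}. \tag{1}$$

The corresponding measure $P_{x_1 x_2 \ldots x_n}(B)$ is

$$P_{x_1 x_2 \ldots x_n}(B) = P\{\omega : (\zeta(x_1), \zeta(x_2), \ldots, \zeta(x_n)) \in B\}, \quad B \in \mathfrak{B}^n. \tag{2}$$

The measures (2) are termed marginal distributions of the random
function $\zeta(x)$. The family of marginal distributions of a random function
possesses two obvious properties:

1.
$$P_{x_1 \ldots x_n x_{n+1} \ldots x_{n+m}}(B \times \mathcal{X}^m) = P_{x_1 \ldots x_n}(B), \tag{3}$$

where $B \in \mathfrak{B}^n$.

2. Let $s$ be a pointwise mapping of $\mathcal{X}^n$ into $\mathcal{X}^n$ acting according to the following
rule: $s(y_1, y_2, \ldots, y_n) = (y_{i_1}, y_{i_2}, \ldots, y_{i_n})$, where $(i_1, \ldots, i_n)$ is a certain
permutation of the indices $(1, 2, \ldots, n)$, and $S$ the corresponding mapping
of the sets of $\mathcal{X}^n$ into $\mathcal{X}^n$.

Then

$$P_{s(x_1, x_2, \ldots, x_n)}(SB) = P_{x_1 x_2 \ldots x_n}(B) \tag{4}$$

for any $B$ and $n$.

Properties (3) and (4) are called the conditions of compatibility for the
family of marginal distributions.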
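The compatibility conditions (3) and (4) can be checked mechanically for a random function with finitely many arguments and values. In the sketch below the random function is a short two-state chain whose outcomes are whole paths; the transition table `step` and the helpers `marginal` and `project` are assumptions introduced for illustration only.

```python
from fractions import Fraction
from itertools import product

X = [0, 1, 2, 3]                                   # argument set (assumed finite here)
half, third = Fraction(1, 2), Fraction(1, 3)
step = {0: {0: 1 - third, 1: third}, 1: {0: half, 1: half}}   # assumed transition probabilities

def path_prob(w):
    """Probability of a whole path w, starting from a uniform initial state."""
    p = half
    for a, b in zip(w, w[1:]):
        p *= step[a][b]
    return p

omega = list(product([0, 1], repeat=len(X)))       # each outcome is a function X -> {0, 1}
P = {w: path_prob(w) for w in omega}

def marginal(xs):
    """P_{x_1...x_n} as a table: tuple of values -> probability."""
    dist = {}
    for w in omega:
        key = tuple(w[x] for x in xs)
        dist[key] = dist.get(key, 0) + P[w]
    return dist

def project(dist, n):
    """Measure of B x (all values of the remaining coordinates): keep the first n coordinates."""
    out = {}
    for key, p in dist.items():
        out[key[:n]] = out.get(key[:n], 0) + p
    return out

# Condition (3): P_{x1 x2 x3}(B x X) = P_{x1 x2}(B).
cond3 = project(marginal((2, 0, 3)), 2) == marginal((2, 0))

# Condition (4): permuting the arguments and the coordinates of B together changes nothing.
perm, xs = (2, 0, 1), (3, 1, 0)
permuted = marginal(tuple(xs[i] for i in perm))
original = marginal(xs)
cond4 = all(permuted[tuple(y[i] for i in perm)] == p for y, p in original.items())
```

Both checks succeed automatically here because the tables are computed from a single measure `P` on the path space; the conditions become genuine restrictions only when a family of distributions is prescribed without an underlying probability space.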

We now return to the general definition of a random function. The 
arguments concerning the actual indistinguishability of equivalent 
random variables discussed above are also important in the case of ran- 
dom functions. It is customary to assume that from a practical point of 
view an experiment allows us to distinguish only between hypotheses 
which refer to marginal distributions of a random function. 

Thus, one cannot, using experimental data, distinguish between two 
random functions which have marginal distributions coinciding for 
any n and x^eX. In this connection we stipulate the following 

Definition 2. The random functions $\zeta(x)$ and $\zeta'(x)$, $x \in X$, with values
in $\mathcal{X}$, defined possibly on two different probability spaces $\{\Omega, \mathfrak{S}, P\}$ and
$\{\Omega', \mathfrak{S}', P'\}$, are called stochastically equivalent in the wide sense if for any
integer $n \geq 1$ and any $x_k \in X$, $k = 1, 2, \ldots, n$, their marginal distributions
coincide:

$$P\{\omega : (\zeta(x_1), \zeta(x_2), \ldots, \zeta(x_n)) \in B\} = P'\{\omega' : (\zeta'(x_1), \zeta'(x_2), \ldots, \zeta'(x_n)) \in B\}.$$

In what follows we shall often utilize the notion of stochastic equivalence
of random functions in a narrower sense.

Definition 3. Two random functions $g_1(x, \omega)$ and $g_2(x, \omega)$ ($x \in X$, $\omega \in \Omega$)
defined on the same probability space $\{\Omega, \mathfrak{S}, P\}$ are called stochastically
equivalent if for any $x \in X$

$$P\{g_1(x, \omega) \neq g_2(x, \omega)\} = 0.$$

Clearly, if $g_1(x, \omega)$ and $g_2(x, \omega)$ are stochastically equivalent, then
they are stochastically equivalent in the wide sense.
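Stochastic equivalence in the wide sense compares only marginal distributions, so the two random functions may live on different probability spaces. A minimal sketch, assuming two invented representations of a pair of fair coin tosses (the names `zeta1`, `zeta2` and `marginal` are hypothetical):

```python
from fractions import Fraction
from itertools import product

X = [0, 1]                                         # argument set

# First representation: omega is a pair of tosses and zeta(x, omega) = omega[x].
omega1 = list(product([0, 1], repeat=2))
P1 = {w: Fraction(1, 4) for w in omega1}
zeta1 = lambda x, w: w[x]

# Second representation: omega' is an integer 0..3 and zeta'(x, omega') is its x-th binary digit.
omega2 = [0, 1, 2, 3]
P2 = {w: Fraction(1, 4) for w in omega2}
zeta2 = lambda x, w: (w >> x) & 1

def marginal(omega, P, zeta, xs):
    """Distribution of (zeta(x_1), ..., zeta(x_n)) as a table."""
    dist = {}
    for w in omega:
        key = tuple(zeta(x, w) for x in xs)
        dist[key] = dist.get(key, 0) + P[w]
    return dist

# All argument lists of length 1 and 2, repetitions included.
arg_lists = [xs for n in (1, 2) for xs in product(X, repeat=n)]
wide_sense_equivalent = all(
    marginal(omega1, P1, zeta1, xs) == marginal(omega2, P2, zeta2, xs)
    for xs in arg_lists
)
```

Definition 3 cannot even be stated for this pair, since the two functions are not defined on the same probability space; only the wide-sense notion applies.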

How then can one define random functions in specific problems?
Firstly, a random function can be defined using the general definition by
explicitly stating the probability space $\{\Omega, \mathfrak{S}, P\}$ and the function
$\zeta(x) = g(x, \omega)$, both of which should be as simple as possible. Another method
is to define a measure on a certain functional space $U$ whose elements
are functions on $X$ and to consider the functions $\zeta(x) = u(x)$ on $U$. This
method of definition and investigation is studied in Chapter V. The
difficulty in this method is due to the complexity of the specific description
of a measure in a functional space. This difficulty can sometimes be
alleviated by considering a given random function $\zeta(x) = u(x)$ as a result
of some transformation $S$ defined on a more or less simple functional
space $\mathcal{V}$ with a relatively simple measure $\mu$: $u(x) = S[v]$, $v \in \mathcal{V}$, where
$\{\mathcal{V}, \mathfrak{S}, \mu\}$ is a measure space.

Such an approach is discussed in Chapters 4 and 8, dealing with linear
and nonlinear transformations of random functions respectively.

The third method, possibly the most prevalent, of defining a random
function is based on the description of the family of its marginal
distributions. This is due to the fact that, firstly, in many practical
problems random functions are characterized by their marginal distributions
and the corresponding probability space is usually not given at all.
Secondly, in many cases it is simpler to define marginal distributions
than the corresponding probability spaces and functions $g(x, \omega)$. Next,
as it turns out, it is sufficient to know only the marginal distributions of
random functions to solve many important problems. On the other hand,
as will be proved shortly, under very broad assumptions, given an
arbitrary family of distributions $P_{x_1 x_2 \ldots x_n}(B^{(n)})$ defined on $\{\mathcal{X}^n, \mathfrak{B}^n\}$ for any
integer $n$ and $x_k \in X$, one can construct a probability
space $\{\Omega, \mathfrak{S}, P\}$ and a random function $\zeta(x) = g(x, \omega)$ whose marginal
distributions coincide with the given family.

Definition 4. A compatible choice of distributions $\{P_{x_1 x_2 \ldots x_n}(B^{(n)}),\ n = 1, 2, \ldots,\ x_k \in X\}$,
where $\mathfrak{B}$ is a σ-algebra of sets in the space $\mathcal{X}$
and $\mathfrak{B}^n$ is its $n$-th power, is called a random function in the wide sense with
values in $\mathcal{X}$.

The standard notation $\xi(x), \eta(x), \ldots$ is utilized for random functions
in the wide sense, $x$ is called the argument and the distribution of the
sequence $\{\xi(x_1), \ldots, \xi(x_n)\}$ is identified with $P_{x_1 x_2 \ldots x_n}(B^{(n)})$. It follows
from the above that each random function in the wide sense (here $\mathcal{X}$ is
a complete separable metric space and $X$ is arbitrary) can be considered
also as a random function in the sense of the basic Definition 1 (cf.
Theorem 2 of the present paragraph). On the other hand a random
function in the wide sense can be identified with the class of all stochastically
equivalent (in the wide sense) random functions possessing the given
marginal distributions.

Construction of a random function according to its marginal distributions. 

In view of the definition of stochastically equivalent (in the wide sense)
random functions the most typical feature of a random function is not
the probability space or the form of the function $g(x, \omega) = \zeta(x)$ but the
family of its marginal distributions. This means that we can change
(arbitrarily) the probability space and the form of the function $g(x, \omega)$ as
long as the family of marginal distributions remains unchanged. This very
important fact is widely used for obtaining a possibly simpler and more
convenient representation of a random function. In this connection the
following problem immediately arises: let a family of distributions

$$\{P_{x_1 x_2 \ldots x_n}(B^{(n)}),\ n = 1, 2, \ldots,\ x_k \in X\} \tag{5}$$

be given, where $X$ is an arbitrary set and $\{\mathcal{X}, \mathfrak{B}\}$ a measurable space. Does
a random function exist for which the given family of distributions is the
family of its marginal distributions?

Clearly, the family of distributions (5) cannot be completely arbitrary. 
It should at least satisfy the compatibility conditions (3) and (4). 

Definition 5. If there exists a probability space $\{\Omega, \mathfrak{S}, P\}$ and a function
of two variables $g(x, \omega)$ with values in $\mathcal{X}$, defined on $X \times \Omega$, which is
$\mathfrak{S}$-measurable for each fixed $x \in X$ and such that the marginal
distributions of the random function $g(x, \omega)$ coincide with the given family (5),
i.e. for each $B^{(n)} \in \mathfrak{B}^n$

$$P\{\omega : (g(x_1, \omega), g(x_2, \omega), \ldots, g(x_n, \omega)) \in B^{(n)}\} = P_{x_1 x_2 \ldots x_n}(B^{(n)}), \tag{6}$$

then the probability space $\{\Omega, \mathfrak{S}, P\}$ and the function $g(x, \omega)$ are called
the representation of the family of distributions (5).

It will be shown that under sufficiently broad assumptions a
compatible family of distributions (5) admits a certain representation. The
space $\Omega$ is replaced here by the space of all functions defined on $X$ with
values in $\mathcal{X}$, and the elementary events are functions of $x$, $\omega = \omega(x)$,
and $g(x, \omega) = \omega(x)$.

Definition 6. Let $\Omega$ be the space of all functions $\omega = \omega(x)$ defined on the
set $X$ with values in some measurable space $\{\mathcal{X}, \mathfrak{B}\}$ and let $B^{(n)} \in \mathfrak{B}^n$. The
set of functions $\omega(x) \in \Omega$ for which the point $\{\omega(x_1), \ldots, \omega(x_n)\}$ from $\mathcal{X}^n$
belongs to $B^{(n)}$, i.e. the set

$$C_{x_1 \ldots x_n}(B^{(n)}) = \{\omega : (\omega(x_1), \ldots, \omega(x_n)) \in B^{(n)}\},$$

is called the cylindrical set in $\Omega$ with the basis $B^{(n)}$ over the coordinates
$x_1, \ldots, x_n$, or simply the cylindrical set (or cylinder).

A few remarks concerning cylindrical sets and the operations
involving these sets are in order. If $n$ and the points $x_1, x_2, \ldots, x_n$ are fixed, then
there exists an isomorphism between the cylindrical sets over the
coordinates $x_1, x_2, \ldots, x_n$ and the sets in $\mathfrak{B}^n$: each set $B^{(n)} \in \mathfrak{B}^n$ determines a
cylindrical set $C_{x_1 \ldots x_n}(B^{(n)})$ for which it serves as a basis; different
cylindrical sets correspond to different bases. The sum, difference and
intersection of cylindrical sets correspond to the sum, difference and
intersection of the bases. This follows directly from the definition of cylindrical
sets.

Considering now the operations on cylindrical sets in the general
case one should keep in mind that the same set can be defined for different
choices of coordinates. It is clear that

$$C_{x_1 \ldots x_n}(B^{(n)}) = C_{x_1 \ldots x_n x_{n+1} \ldots x_{n+m}}(B^{(n)} \times \mathcal{X}^m).$$

It is also easy to see that any two cylindrical sets $C = C_{x_1 \ldots x_n}(B)$ and
$C' = C_{x'_1 \ldots x'_m}(B')$ can always be regarded as cylindrical sets over the
same sequence of coordinates $x''_1, \ldots, x''_p$ containing $x_1, x_2, \ldots, x_n$ as well
as $x'_1, x'_2, \ldots, x'_m$. It follows from here that when discussing algebraic
operations on a finite number of cylindrical sets it can be assumed that
they are defined on a fixed sequence of coordinates. Therefore the
following theorem is valid:

Theorem 1. The class $\mathfrak{C}$ of all cylindrical sets forms an algebra of sets.

Moreover, if $X$ contains infinitely many points and $\mathcal{X}$ at least two
points, then $\mathfrak{C}$ is not a σ-algebra. Indeed, the set

$$\bigcup_{k=1}^{\infty} C_{x_k}(\{y_k\}),$$

where $x_k$, $k = 1, 2, \ldots$, is a sequence of points in $X$, is not a cylindrical set.
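The remarks on cylindrical sets can be made concrete when both the argument set and the phase space are finite, so that the whole function space can be enumerated. In the sketch below the sets `X` and `VALUES` and the helpers `cylinder` and `lift` are illustrative assumptions; `lift` rewrites a basis over a longer coordinate list, which is the step that lets two cylinders be compared over common coordinates.

```python
from itertools import product

X = [0, 1, 2]                          # argument set (finite here, for illustration only)
VALUES = [0, 1]                        # phase space
omega = list(product(VALUES, repeat=len(X)))   # all functions X -> VALUES, stored as tuples

def cylinder(xs, base):
    """C_{x_1...x_n}(B): the functions whose values at xs form a point of base."""
    return {w for w in omega if tuple(w[x] for x in xs) in base}

def lift(xs, base, new_xs):
    """Rewrite the basis over a coordinate list new_xs that contains every point of xs."""
    pos = [new_xs.index(x) for x in xs]
    return {y for y in product(VALUES, repeat=len(new_xs))
            if tuple(y[i] for i in pos) in base}

C1 = cylinder((0,), {(1,)})                    # omega(0) = 1
C2 = cylinder((2, 1), {(0, 0), (1, 1)})        # omega(2) = omega(1)

common = (0, 1, 2)                             # a coordinate list containing both
B1 = lift((0,), {(1,)}, common)
B2 = lift((2, 1), {(0, 0), (1, 1)}, common)

same_sets = cylinder(common, B1) == C1 and cylinder(common, B2) == C2
operations_match = (
    cylinder(common, B1 & B2) == C1 & C2
    and cylinder(common, B1 | B2) == C1 | C2
    and cylinder(common, B1 - B2) == C1 - C2
)
```

With a finite argument set every subset of the function space is a cylinder, so this example shows only the algebra property; the failure of countable additivity of the class requires infinitely many arguments.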

We now prove the following theorem.

Theorem 2 (Kolmogorov). Let $\mathcal{X}$ be a complete separable metric space. The
family of distributions (5) satisfying the compatibility conditions (3) and (4)
admits a certain representation.

We first define a set function $P'(C)$, $C \in \mathfrak{C}$, on the algebra of cylindrical
sets $\mathfrak{C}$ of the space $\Omega$ by putting

$$P'(C) = P_{x_1 x_2 \ldots x_n}(B^{(n)}),$$

where $C$ is a cylindrical set with the basis $B^{(n)}$ over the coordinates
$x_1, x_2, \ldots, x_n$. The compatibility conditions guarantee the uniqueness of
the definition of the function $P'(C)$. Let $C_k$, $k = 1, 2, \ldots, n$, be a sequence of
cylindrical sets. Without loss of generality it can be assumed that the
sets $C_k$ are defined by the bases $B_k$ over the same sequence of coordinates
$x_1, x_2, \ldots, x_p$. The algebraic operations over the bases $B_k$ correspond
exactly to the same algebraic operations over the sets $C_k$. Since the
measure $P_{x_1 \ldots x_p}(B^{(p)})$ is countably additive on $\mathfrak{B}^p$ it follows that the set
function $P'(C)$ is finitely additive on $\mathfrak{C}$. It remains only to extend the
function $P'(C)$ defined on the algebra $\mathfrak{C}$ to a measure $P$ on a certain σ-algebra
$\mathfrak{S}$. For this purpose it is sufficient, based on the well known theorem on
the extension of measures, to check that for an arbitrary $C \in \mathfrak{C}$ and any
cover $\{C_k\}$, $k = 1, 2, \ldots, n, \ldots$, $C_k \in \mathfrak{C}$, of the set $C$, $C \subset \bigcup_{k=1}^{\infty} C_k$, the inequality

$$P'(C) \leq \sum_{k=1}^{\infty} P'(C_k) \tag{7}$$

is satisfied.

We now show that if

$$\bigcup_{k=1}^{\infty} C_k = C \quad (C \in \mathfrak{C},\ C_k \in \mathfrak{C},\ k = 1, 2, \ldots)$$

and $C_k \cap C_r = \emptyset$ ($k \neq r$), then

$$P'(C) = \sum_{k=1}^{\infty} P'(C_k). \tag{8}$$

The validity of (7) follows from here for an arbitrary cover of the
cylindrical set $C$ by means of the sets belonging to $\mathfrak{C}$. Now set

$$C \setminus \bigcup_{k=1}^{n} C_k = D_n.$$

The sets $D_n$ form a monotonically decreasing sequence of cylindrical sets
with a void intersection

$$\bigcap_{n=1}^{\infty} D_n = C \setminus \bigcup_{k=1}^{\infty} C_k = \emptyset. \tag{9}$$

The equality $P'(C) = \sum_{k=1}^{n} P'(C_k) + P'(D_n)$ follows from the additivity of $P'$.

To prove (8) it is therefore sufficient to show that

$$\lim_{n \to \infty} P'(D_n) = 0.$$

Assume the contrary, that

$$\lim_{n \to \infty} P'(D_n) = L > 0. \tag{10}$$

Denote by $B_n$ the basis of the cylinder $D_n$ and let $D_n$ be situated over
the coordinates $x_1, x_2, \ldots, x_{m_n}$. It is assumed here that as $n$ increases the
collection of the corresponding points $x_1, x_2, \ldots, x_{m_n}$ does not decrease.
As was shown above, this assumption does not restrict the generality.
For each $B_n$ a compact set $K_n$ ($K_n \subset B_n$) can be found such that

$$P_{x_1 x_2 \ldots x_{m_n}}(B_n \setminus K_n) < \frac{L}{2^{n+1}}, \quad n = 1, 2, \ldots$$

Let $Q_n$ be the cylinder over the coordinates $x_1, x_2, \ldots, x_{m_n}$ with the
basis $K_n$, let $G_n = \bigcap_{r=1}^{n} Q_r$ and let $M_n$ be the basis of the set $G_n$. Clearly $M_n$ is
a compact set, being an intersection of closed sets such that at least one
of them, namely $K_n$, is compact.

Since the sets $G_n$ are monotonically decreasing it follows from the
relation $\omega(x) \in G_{n+p}$ ($p > 0$) that $\omega(x) \in G_n$. Therefore if

$$(y_1, \ldots, y_{m_{n+p}}) \in M_{n+p}$$

then

$$(y_1, \ldots, y_{m_n}) \in M_n.$$

The sets $G_n$ are clearly non-empty. Moreover, since

$$D_n \setminus G_n = \bigcup_{r=1}^{n} (D_n \setminus Q_r) \subset \bigcup_{r=1}^{n} (D_r \setminus Q_r),$$

then

$$P'(D_n \setminus G_n) \leq \sum_{r=1}^{n} P'(D_r \setminus Q_r) = \sum_{r=1}^{n} P_{x_1 \ldots x_{m_r}}(B_r \setminus K_r) < \frac{L}{2},$$

thus it follows from here that

$$\lim_{n \to \infty} P'(G_n) = \lim_{n \to \infty} P'(D_n) - \lim_{n \to \infty} P'(D_n \setminus G_n) > \frac{L}{2}.$$



Select a point

$$(y_1^{(n)}, y_2^{(n)}, \ldots, y_{m_n}^{(n)})$$

from each one of the sets $M_n$.

In view of the above, for any $k$ the sequence of the points $y_k^{(n)}$,
$n = 1, 2, \ldots$, belongs to a compact set in $\mathcal{X}$ and the sequence

$$(y_1^{(n+r)}, y_2^{(n+r)}, \ldots, y_{m_n}^{(n+r)}), \quad r = 0, 1, 2, \ldots$$

is in $M_n$. By means of the diagonal process a sequence of indices $n_j$ can
be found such that for each $k$ the sequence $y_k^{(n_j)}$ converges to a certain
limit $y_k^{(0)}$. Since the sets $M_n$ are closed it follows that
$(y_1^{(0)}, y_2^{(0)}, \ldots, y_{m_n}^{(0)}) \in M_n$ for each $n$.

Define a function $\omega(x)$ by putting $\omega(x_k) = y_k^{(0)}$, $k = 1, 2, \ldots$, and
supplementing the definition arbitrarily at other points. Then for each
$n$ we have $\omega(x) \in G_n \subset D_n$. Hence $\bigcap_{n=1}^{\infty} D_n$ is not empty, which contradicts
(9). Therefore inequality (10) cannot be valid and

$$\lim_{n \to \infty} P'(D_n) = 0.$$

Consequently the function $P'$ satisfies inequality (7) and can be extended
to a complete measure $\{\Omega, \mathfrak{S}, P\}$. Define the function $g(x, \omega)$, $\omega \in \Omega$, $x \in X$,
by the equality $g(x, \omega) = \omega(x)$. We have for an arbitrary Borel set $B^{(n)}$
in $\mathcal{X}^n$ and any $x_1, x_2, \ldots, x_n$

$$P\{(g(x_1, \omega), g(x_2, \omega), \ldots, g(x_n, \omega)) \in B^{(n)}\} = P\{(\omega(x_1), \omega(x_2), \ldots, \omega(x_n)) \in B^{(n)}\} = P_{x_1 x_2 \ldots x_n}(B^{(n)}).$$

Therefore the required representation stated in the theorem has been
constructed for the family of distributions (5). The theorem is thus
proved. □
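For a finite argument set the construction in the proof needs no extension step, and the canonical representation $g(x, \omega) = \omega(x)$ can be written out in full. The sketch below starts from a prescribed family of distributions and verifies (6) for every list of distinct arguments. The particular family used (an exchangeable 0/1 family) and the names `family`, `g` and `marginal_of_g` are assumptions made for illustration; the theorem itself concerns arbitrary $X$.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

X = ["a", "b", "c"]                    # finite argument set, for illustration only

def family(xs, ys):
    """Prescribed family P_{x_1...x_n}({(y_1, ..., y_n)}) for distinct arguments xs.

    Assumed example: P = k! (n - k)! / (n + 1)!, with k the number of ones among
    the y_i. The value does not depend on xs, only on how many ones occur.
    """
    n, k = len(ys), sum(ys)
    return Fraction(factorial(k) * factorial(n - k), factorial(n + 1))

# Canonical representation: omega is a function X -> {0, 1}, stored as a dict,
# and the measure of the one-point cylinder {omega} is read off from the family.
omega = [dict(zip(X, ys)) for ys in product([0, 1], repeat=len(X))]
P = [family(tuple(X), tuple(w[x] for x in X)) for w in omega]
g = lambda x, w: w[x]                  # g(x, omega) = omega(x)

def marginal_of_g(xs, ys):
    """P{omega: (g(x_1, omega), ..., g(x_n, omega)) = (y_1, ..., y_n)}."""
    return sum(p for w, p in zip(omega, P)
               if tuple(g(x, w) for x in xs) == tuple(ys))

represents = all(
    marginal_of_g(xs, ys) == family(xs, ys)
    for n in range(1, len(X) + 1)
    for xs in permutations(X, n)
    for ys in product([0, 1], repeat=n)
)
```

The check succeeds because the assumed family satisfies the compatibility conditions (3) and (4); for an incompatible family the lower-dimensional marginals of `P` would disagree with the prescribed ones.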
Random Sequences



§ 1. Preliminary Remarks 

Sequences of random variables $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ can be regarded as a
discrete-time random process. Random sequences play an important
role in the general theory.

Firstly, many probabilistic problems involve the discrete parameter 
(time). 

Secondly, investigation of discrete parameter processes utilizes in a 
certain sense simpler methods, while these processes can be used to 
approximate or study arbitrary continuous-parameter processes. 

The basic problems studied in this chapter pertain to the asymptotic 
behavior of a random sequence as the value of the parameter increases 
to infinity. These are the problems dealing with the existence of limits 
of sequences, the behavior of the arithmetic mean, the asymptotic 
character of the distribution of the terms of a divergent sequence and 
so on. This class of problems interconnects the classical topics of prob- 
ability theory (laws of large numbers, limiting theorems for the sum of 
random terms, etc.) with the general theory of random processes. 

Evidently, to obtain significant results which differ from the general 
criteria of convergence it is required to impose definite restrictions on 
the random sequences under investigation. Correspondingly certain 
important classes of random sequences are introduced and investigated 
for which there are available nontrivial results related to the problems 
mentioned above. 

Let $\{\Omega, \mathfrak{S}, \mathsf{P}\}$ be a fixed probability space and $\{\mathcal{X}, \mathfrak{B}\}$ be a measurable
space. In this chapter $T$ denotes either the sequence of non-negative
integers $T_+ = \{0, 1, 2, \dots, n, \dots\}$ or the ordered set of all integers
$\tilde{T} = \{\dots, -1, 0, 1, \dots, n, \dots\}$ (unless otherwise stipulated).

The function $\{\xi(\cdot) = \xi(t) = \xi(t, \omega),\ t \in T,\ \omega \in \Omega\}$ with values in $\mathcal{X}$
is called a random sequence or a random discrete parameter process if,
for any $B \in \mathfrak{B}$ and $t \in T$, $\{\omega : \xi(t, \omega) \in B\} \in \mathfrak{S}$.

The values $\xi(t)$ are sometimes called the states of a certain stochastic
system $\Sigma$ and the space $\mathcal{X}$ the phase space of the system $\Sigma$.

In the case when $\mathcal{X}$ is a metric space, $\mathfrak{B}$ will always denote the $\sigma$-algebra
of Borel sets of $\mathcal{X}$.

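Let $\{\mathcal{X}^s, \mathfrak{B}^s\}$ be the $s$-th power of the measurable space $\{\mathcal{X}, \mathfrak{B}\}$. For
an arbitrary choice of integers $t_1, t_2, \dots, t_s$ from $T$, $t_1 < t_2 < \dots < t_s$, the
random sequence $\{\xi(t), t \in T\}$ defines a probability measure

$$\mathsf{P}_{t_1 \dots t_s}(B^{(s)}) = \mathsf{P}\{(\xi(t_1), \dots, \xi(t_s)) \in B^{(s)}\},$$

where $B^{(s)}$ is a set in $\mathfrak{B}^s$. These measures are called marginal distributions
of the random sequence.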

Marginal distributions completely determine in a certain sense, the 
corresponding random sequence. The precise meaning of this proposition 
is as follows. 

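Consider the space $\mathcal{X}^T$ of all possible sequences $x = \{x_t, t \in T\}$.
Denote by $\mathfrak{C}_0$ the algebra of cylindrical sets $C$ in the space $\mathcal{X}^T$:

$$C = C_{t_1 \dots t_s}(B^{(s)}) = \{x : (x_{t_1}, \dots, x_{t_s}) \in B^{(s)}\}, \quad B^{(s)} \in \mathfrak{B}^s, \quad s = 1, 2, \dots$$

The mapping $\omega \to x$ determined by the random sequence

$$\{\xi(t), t \in T\}: \quad x = \{x_t, t \in T\} = \{\xi(t, \omega), t \in T\},$$

induces a transformation of the probability measure $\mathsf{P}$ into the probability
measure $\mathsf{P}'$ defined on a certain $\sigma$-algebra $\mathfrak{C}'$ of the space $\mathcal{X}^T$ containing
all the cylinders. The measure $\mathsf{P}'$ coincides on the cylinders with the
marginal distributions of the random sequence, namely

$$\mathsf{P}'(C_{t_1 \dots t_s}(B^{(s)})) = \mathsf{P}\{(\xi(t_1), \dots, \xi(t_s)) \in B^{(s)}\},$$

and hence the marginal distributions uniquely determine the measure $\mathsf{P}'$
on the minimal $\sigma$-algebra $\mathfrak{C}$ ($\mathfrak{C} \subset \mathfrak{C}'$) containing the algebra of cylindrical
sets.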

To solve the problems arising in the theory of random sequences it is
often sufficient to know the probabilities of events in $\mathfrak{C}$. There is therefore
no reason to distinguish between random sequences $\{\xi_i(t), t \in T\}$, $i = 1, 2$,
with the same phase space $\{\mathcal{X}, \mathfrak{B}\}$ defined on different probability
spaces $\{\Omega_i, \mathfrak{S}_i, \mathsf{P}_i\}$ if the probabilities $\mathsf{P}_i'$ induced by these sequences
coincide on the cylindrical sets of the space $\mathcal{X}^T$. In this connection we
shall agree to refer to a random sequence $\{\xi'(t), t \in T\}$ stochastically
equivalent to $\{\xi(t), t \in T\}$, defined on $\{\mathcal{X}^T, \mathfrak{C}, \mathsf{P}'\}$ and such that
$\xi'(t) = \xi'(t, x) = x_t$, as the natural representation of the random sequence
$\{\xi(t), t \in T\}$.

The rationale behind the introduction of the notion of a natural
representation of a random sequence is the fact that in many problems
marginal distributions are defined in a certain manner. On the other
hand, given an arbitrary consistent family of marginal distributions and if $\mathcal{X}$ is a
complete and separable metric space, one can always construct a random
sequence in its natural representation whose marginal distributions
coincide with the given ones. This is a direct corollary of Kolmogorov's
theorem (Chapter I, Section 4, Theorem 2).



§ 2. Semi-Martingales and Martingales 

Definitions and basic properties. Martingales and semi-martingales form 
an important class of random sequences with numerous applications. 

To avoid repetitions in future, we present definitions relating not 
only to sequences but to more general random processes. 

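Let $T$ be an arbitrary ordered set and $\{\mathfrak{F}_t, t \in T\}$ be a current of
$\sigma$-algebras, i.e. $\mathfrak{F}_s \subset \mathfrak{F}_t \subset \mathfrak{S}$ for $s < t$.

Introduce the following notation

$$a^+ = \max(a, 0), \qquad a^- = \max(-a, 0).$$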



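Definition 1. The family $\{\xi(t), \mathfrak{F}_t;\ t \in T\}$ in which the random variables
$\xi(t)$ are $\mathfrak{F}_t$-measurable for each $t \in T$ is called a martingale if

$$\mathsf{E}|\xi(t)| < \infty, \tag{1}$$

$$\mathsf{E}\{\xi(t) \mid \mathfrak{F}_s\} = \xi(s), \quad s < t, \quad s, t \in T. \tag{2}$$

The family is called a sub-martingale, if

$$\mathsf{E}\xi^+(t) < \infty, \qquad \mathsf{E}\{\xi(t) \mid \mathfrak{F}_s\} \ge \xi(s), \quad s < t, \quad s, t \in T, \tag{3}$$

and supermartingale if

$$\mathsf{E}\xi^-(t) < \infty, \qquad \mathsf{E}\{\xi(t) \mid \mathfrak{F}_s\} \le \xi(s), \quad s < t, \quad s, t \in T. \tag{4}$$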



Super and sub-martingales are also called semi-martingales. 
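
As an illustration of Definition 1, the following sketch checks relations (2) and (3) by exhaustive enumeration for a fair $\pm 1$ random walk $\zeta_t$ with the $\sigma$-algebras generated by the first $s$ steps. The walk, the horizon and the helper names `conditional_mean` and `holds_for_all` are assumptions made for this example only; they do not come from the text.

```python
from fractions import Fraction
from itertools import product


def conditional_mean(prefix, t, f):
    """E{ f(zeta_t) | first len(prefix) steps } for a fair +-1 walk, by enumeration."""
    start = sum(prefix)
    tails = list(product((-1, 1), repeat=t - len(prefix)))
    return sum(Fraction(f(start + sum(tail))) for tail in tails) / len(tails)


def holds_for_all(n, f, relation):
    """Check relation(E{f(zeta_t) | F_s}, f(zeta_s)) for all 0 <= s < t <= n."""
    return all(
        relation(conditional_mean(prefix, t, f), f(sum(prefix)))
        for t in range(1, n + 1)
        for s in range(t)
        for prefix in product((-1, 1), repeat=s)
    )


# The walk itself satisfies (2); its square satisfies (3) but not (2).
walk_is_martingale = holds_for_all(6, lambda z: z, lambda lhs, rhs: lhs == rhs)
square_is_submartingale = holds_for_all(6, lambda z: z * z, lambda lhs, rhs: lhs >= rhs)
```

Exact rational arithmetic is used so that the equality in (2) can be tested without rounding error.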

In certain cases, when the family of $\sigma$-algebras $\{\mathfrak{F}_t, t \in T\}$ is fixed and
no confusion can possibly arise, we shall use the corresponding term to
denote the family of random variables $\{\xi(t), t \in T\}$ themselves.

It follows from the definition that $\mathfrak{F}_t$ always contains the $\sigma$-algebra
generated by the random variables $\{\xi(s), s \le t\}$. Sometimes this $\sigma$-algebra
is taken as $\mathfrak{F}_t$ in the definition of martingales (semi-martingales).

We now present a number of properties of martingales and submartingales.
Since the replacement of $\xi(t)$ by $-\xi(t)$ transforms a submartingale
into a supermartingale, the properties of submartingales are
easily restated for supermartingales.
a) Relations (2) and (3) are equivalent to the following ($s < t$, $s, t \in T$):

$$\int_{B_s} \xi(s)\, \mathsf{P}(d\omega) = \int_{B_s} \xi(t)\, \mathsf{P}(d\omega), \tag{5}$$

$$\int_{B_s} \xi(s)\, \mathsf{P}(d\omega) \le \int_{B_s} \xi(t)\, \mathsf{P}(d\omega), \tag{6}$$

where $B_s$ is an arbitrary $\mathfrak{F}_s$-measurable set.

Indeed, (5) and (6) are obtained by integrating (2) and (3) respectively.
It is easy to verify that conversely, (5) and (6) yield (2) and (3)
correspondingly.

b) If $\{\xi(t), \mathfrak{F}_t, t \in T\}$ is a submartingale, then $\mathsf{E}\xi(t)$ is a monotonically
non-decreasing function of $t$; if $\{\xi(t), \mathfrak{F}_t, t \in T\}$ is a martingale, then $\mathsf{E}\xi(t)$ is
constant.

c) If $\{\xi(t), \mathfrak{F}_t, t \in T\}$ is a submartingale, $f(x)$ is a continuous and
monotonically non-decreasing convex function on the real line and
$\mathsf{E}f(\xi(t)) < \infty$ for $t \in T$, then $\{f(\xi(t)), \mathfrak{F}_t, t \in T\}$ is also a submartingale.

This assertion follows from the definition of a submartingale and the
Jensen inequality. Indeed,

$$\mathsf{E}\{f(\xi(t)) \mid \mathfrak{F}_s\} \ge f(\mathsf{E}\{\xi(t) \mid \mathfrak{F}_s\}) \ge f(\xi(s)). \tag{7}$$

In particular,

d) If $\{\xi(t), \mathfrak{F}_t, t \in T\}$ is a submartingale, then $\{(\xi(t) - a)^+, \mathfrak{F}_t, t \in T\}$ is
also a submartingale.

e) If $\{\xi(t), \mathfrak{F}_t, t \in T\}$ is a martingale and $f(x)$ is a continuous convex
function and $\mathsf{E}|f(\xi(t))| < \infty$, then $\{f(\xi(t)), \mathfrak{F}_t, t \in T\}$ is a submartingale.

For the proof it is sufficient to observe that the string of inequalities
in (7) is retained in the case under consideration, the only difference
being that the second $\ge$ is replaced by an equality and that monotonicity
of $f(x)$ is not used in this case.

Properties b) and e) yield

f) If $\xi(t)$ is a martingale, then $\mathsf{E}|\xi(t)|$ is monotonically nondecreasing
on $T$.
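
Property f) can be observed on a concrete martingale. The sketch below computes $\mathsf{E}|S_n|$ exactly for a fair $\pm 1$ random walk $S_n$ (an example chosen here, not taken from the text) and collects the first few values, which should form a non-decreasing sequence.

```python
from fractions import Fraction
from math import comb


def mean_abs_walk(n):
    """Exact E|S_n| for a fair +-1 random walk S_n."""
    # S_n = 2k - n when k of the n steps equal +1, with probability C(n, k) / 2^n.
    return sum(Fraction(comb(n, k) * abs(2 * k - n), 2 ** n) for k in range(n + 1))


mean_abs_values = [mean_abs_walk(n) for n in range(11)]
```

For this walk the sequence is not strictly increasing (the values at $n = 1$ and $n = 2$ coincide), which is consistent with the statement, since f) only asserts monotonicity in the wide sense.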

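Lemma 1. If $T$ possesses a maximal element $t_{\max}$ and $\{\xi(t), \mathfrak{F}_t, t \in T\}$ is a
submartingale, then the family of the random variables $\{\xi^+(t), t \in T\}$ is
uniformly integrable.

Proof. From the inequalities

$$N\,\mathsf{P}(B_t) \le \int_{B_t} \xi^+(t)\, \mathsf{P}(d\omega) \le \int_{B_t} \xi^+(t_{\max})\, \mathsf{P}(d\omega) \le \int_{\Omega} \xi^+(t_{\max})\, \mathsf{P}(d\omega), \quad N > 0,$$

where $B_t = \{\omega : \xi(t) > N\}$, it follows that $\mathsf{P}(B_t) \to 0$ for $N \to \infty$ uniformly in $t$.
Therefore for any $\varepsilon > 0$ an $N_0$ can be found not depending on $t$, such
that

$$\int_{B_t} \xi^+(t_{\max})\, \mathsf{P}(d\omega) < \varepsilon$$

for $N > N_0$. In view of the inequalities in the previous string

$$\mathsf{E}\,\chi\{\xi^+(t) > N\}\, \xi^+(t) < \varepsilon$$

for all $N > N_0$, which proves the uniform integrability of the family
$\{\xi^+(t), t \in T\}$. □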



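Some inequalities. In the present section it is assumed that
$T = \{0, 1, 2, \dots, n\}$, $\{\mathfrak{F}_k, k \in T\}$ is a monotonically non-decreasing sequence
of $\sigma$-algebras, $\xi_k$ ($k > 0$, $k \in T$) is an $\mathfrak{F}_k$-measurable random variable and
$\mathsf{E}|\xi_k| < \infty$. Let $\tau_i$ ($i = 1, 2$) be random times on $\{\mathfrak{F}_k, k \in T\}$ ($\tau_i$ take on values
from $T$) and let $\tau_1 \le \tau_2$ with probability 1. Set

$$\eta_i = \sum_{k=1}^{\tau_i} \xi_k,$$

where $\tau_i = 0$ implies $\eta_i = 0$.

Let $\mathfrak{F}_i^*$ denote the $\sigma$-algebra of events induced by the random time $\tau_i$.
Recall that (Chapter I, Section 1) $\mathfrak{F}_i^*$ consists of those sets $E \in \mathfrak{S}$ for which

$$E \cap \{\tau_i \le k\} \in \mathfrak{F}_k, \quad k = 0, 1, \dots, n. \tag{8}$$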

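Lemma 2. Let

$$\mathsf{E}\{\xi_k \mid \mathfrak{F}_{k-1}\} \ge 0, \quad k = 1, 2, \dots, n, \tag{9}$$

and $A \in \mathfrak{F}_1^*$. Then

$$\int_A \eta_1\, \mathsf{P}(d\omega) \le \int_A \eta_2\, \mathsf{P}(d\omega). \tag{10}$$

If however

$$\mathsf{E}\{\xi_k \mid \mathfrak{F}_{k-1}\} = 0, \tag{11}$$

then

$$\int_A \eta_1\, \mathsf{P}(d\omega) = \int_A \eta_2\, \mathsf{P}(d\omega). \tag{12}$$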


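Proof. We first note that $\mathfrak{F}_1^* \subset \mathfrak{F}_2^*$ (Chapter I, Section 1). To prove
inequality (10) it is sufficient to consider the case when $\tau_1$ is constant on
$A$, since in the general case $A = \bigcup_{j=0}^{n} A_j$, where $A_j = A \cap \{\tau_1 = j\} \in \mathfrak{F}_1^*$.

Let $\tau_1 = j$ on $A$. Then $A \in \mathfrak{F}_j$ and $\tau_2 \ge j$ on $A$. We have

$$\int_A \eta_2\, \mathsf{P}(d\omega) = \int_A \eta_1\, \mathsf{P}(d\omega) + \int_{A \cap \{\tau_2 > j\}} \sum_{k=j+1}^{\tau_2} \xi_k\, \mathsf{P}(d\omega) =$$

$$= \int_A \eta_1\, \mathsf{P}(d\omega) + \int_{A \setminus \{\tau_2 \le j\}} \xi_{j+1}\, \mathsf{P}(d\omega) + \int_{A \setminus \{\tau_2 \le j+1\}} \xi_{j+2}\, \mathsf{P}(d\omega) + \dots + \int_{A \setminus \{\tau_2 \le n-1\}} \xi_n\, \mathsf{P}(d\omega).$$

Since $A \setminus \{\tau_2 \le k\} \in \mathfrak{F}_k$ ($k \ge j$), it follows that

$$\int_{A \setminus \{\tau_2 \le k\}} \xi_{k+1}\, \mathsf{P}(d\omega) = \int_{A \setminus \{\tau_2 \le k\}} \mathsf{E}\{\xi_{k+1} \mid \mathfrak{F}_k\}\, \mathsf{P}(d\omega) \ge 0,$$

which yields (10).

The argument presented above also shows that if inequality (9) is
replaced by equality (11) then relation (12) will hold. □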

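Lemma 2 can also be formulated in the following manner: If
$\{\zeta_k, \mathfrak{F}_k, k \in T\}$ is a submartingale, $\tau_i$ are random times on $\{\mathfrak{F}_k, k \in T\}$ and
$\tau_1 \le \tau_2$, then for any $A \in \mathfrak{F}_1^*$

$$\int_A \zeta_{\tau_1}\, \mathsf{P}(d\omega) \le \int_A \zeta_{\tau_2}\, \mathsf{P}(d\omega).$$

Corollary 1. If $\tau_1, \tau_2, \dots, \tau_s$ is a sequence of random times on $\{\mathfrak{F}_k, k = 0, 1, \dots, n\}$
and $\tau_1 \le \tau_2 \le \dots \le \tau_s$, then $\{\eta_k, \mathfrak{F}_k^*, k = 1, \dots, s\}$, $\eta_k = \zeta_{\tau_k}$, forms a
submartingale. If however $\{\zeta_k, \mathfrak{F}_k, k \in T\}$ is a martingale, then
$\{\eta_k, \mathfrak{F}_k^*, k = 1, \dots, s\}$ is also a martingale.

Therefore martingales (semi-martingales) observable at random
instants of time are martingales (semi-martingales) as well.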
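
The statement about observation at random instants can be checked on a small example. The sketch below takes a fair $\pm 1$ random walk on $n$ steps (an assumption of this example), stops it at the random time $\tau_1$ equal to the first visit to a given level (or $n$ if the level is never reached), and compares with $\tau_2 = n$ by enumerating all paths. The function name `stopped_means` and the chosen level are hypothetical.

```python
from fractions import Fraction
from itertools import product


def stopped_means(n, level):
    """Return (E zeta_tau, E zeta_tau^2) for tau = first k with zeta_k >= level, or n."""
    total = Fraction(0)
    total_sq = Fraction(0)
    for steps in product((-1, 1), repeat=n):
        partial = [0]
        for step in steps:
            partial.append(partial[-1] + step)
        # tau depends only on the path up to time tau, so it is a random time.
        tau = next((k for k in range(n + 1) if partial[k] >= level), n)
        total += partial[tau]
        total_sq += partial[tau] ** 2
    paths = 2 ** n
    return total / paths, total_sq / paths


stopped_mean, stopped_mean_sq = stopped_means(8, 2)
```

For the martingale $\zeta_k$ relation (12) with $A = \Omega$ gives $\mathsf{E}\zeta_{\tau_1} = \mathsf{E}\zeta_n = 0$, while for the submartingale $\zeta_k^2$ relation (10) gives $\mathsf{E}\zeta_{\tau_1}^2 \le \mathsf{E}\zeta_n^2 = n$.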
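Corollary 2. If under the assumptions of Lemma 2 condition (9) is omitted
and is replaced by the assumption $\mathsf{E}\bar\xi_k^- < \infty$, where $\bar\xi_k = \mathsf{E}\{\xi_k \mid \mathfrak{F}_{k-1}\}$, then

$$\int_A \Big(\eta_1 - \sum_{k=1}^{n} \bar\xi_k^-\Big)\, \mathsf{P}(d\omega) \le \int_A \eta_2\, \mathsf{P}(d\omega). \tag{13}$$

Indeed, (13) is a consequence of (10) if $\xi_k$ is replaced by $\xi_k + \bar\xi_k^-$. □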

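Lemma 3. Assume that random variables $\xi_k$ are $\mathfrak{F}_k$-measurable, $\mathsf{E}|\xi_k| < \infty$,
$\{\mathfrak{F}_k, k \in T\}$ is a current of $\sigma$-algebras, and $C > 0$. Set

$$\zeta_0 = 0, \quad \zeta_k = \sum_{i=1}^{k} \xi_i, \quad k = 1, 2, \dots, n.$$

Then

$$\mathsf{P}\{\max_{0 \le k \le n} \zeta_k \ge C\} \le \frac{1}{C}\, \mathsf{E}(\zeta_n^+ + Q_n), \tag{14}$$

where $Q_n = \sum_{k=1}^{n} \bar\xi_k^-$, $\bar\xi_k = \mathsf{E}\{\xi_k \mid \mathfrak{F}_{k-1}\}$. If, moreover, for some $p > 1$

$$\mathsf{E}(\zeta_n^+ + Q_n)^p < \infty,$$

then

$$\mathsf{E}\big(\max_{0 \le k \le n} \zeta_k\big)^p \le \Big(\frac{p}{p-1}\Big)^p\, \mathsf{E}(\zeta_n^+ + Q_n)^p. \tag{15}$$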



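Proof. Let $\tau_1$ be the smallest index $k$ such that $\zeta_k \ge C$ ($k = 1, 2, \dots, n$)
and set $\tau_1 = n$ if such an index does not exist. Let $\tau_2 = n$ and $A$ be the
event $\{\eta \ge C\}$, where $\eta = \max_{0 \le k \le n} \zeta_k$.

Here $\tau_1$ and $\tau_2$ are random times on $\{\mathfrak{F}_k, k \in T\}$, $A$ is $\mathfrak{F}_1^*$-measurable and
$\tau_1 \le \tau_2$. Applying (13) we obtain

$$C\,\mathsf{P}(A) \le \int_A \zeta_{\tau_1}\, \mathsf{P}(d\omega) \le \int_A (\zeta_n + Q_n)\, \mathsf{P}(d\omega) \le \mathsf{E}(\zeta_n^+ + Q_n),$$

which proves inequality (14). Moreover, if $\chi(C)$ denotes the indicator
of event $A$, then

$$\eta^p = p \int_0^\infty \chi(C)\, C^{p-1}\, dC.$$

As we have just seen

$$\mathsf{E}\,C\chi(C) \le \mathsf{E}\,\chi(C)(\zeta_n^+ + Q_n),$$

therefore

$$\mathsf{E}\eta^p \le p\, \mathsf{E} \int_0^\infty (\zeta_n^+ + Q_n)\, \chi(C)\, C^{p-2}\, dC = \frac{p}{p-1}\, \mathsf{E}(\zeta_n^+ + Q_n)\, \eta^{p-1}.$$

Utilizing Hölder's inequality we obtain

$$\mathsf{E}\eta^p \le \frac{p}{p-1}\, \{\mathsf{E}\eta^p\}^{\frac{p-1}{p}}\, \{\mathsf{E}(\zeta_n^+ + Q_n)^p\}^{\frac{1}{p}},$$

which yields (15). □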

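Corollary. If $\{\zeta_k, \mathfrak{F}_k, k \in T\}$ is a submartingale, then

$$\mathsf{P}\{\max_{0 \le k \le n} \zeta_k \ge C\} \le \frac{1}{C}\, \mathsf{E}\zeta_n^+, \tag{16}$$

moreover if $\mathsf{E}(\zeta_n^+)^p < \infty$ for some $p > 1$, then

$$\mathsf{E}\big(\max_{0 \le k \le n} \zeta_k^+\big)^p \le \Big(\frac{p}{p-1}\Big)^p\, \mathsf{E}(\zeta_n^+)^p. \tag{17}$$

Next, if $\{\zeta_k, \mathfrak{F}_k, k \in T\}$ is a martingale, then

$$\mathsf{P}\{\max_{0 \le k \le n} |\zeta_k| \ge C\} \le \frac{1}{C}\, \mathsf{E}|\zeta_n|. \tag{18}$$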
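
Inequality (18) lends itself to a direct numerical check. The following sketch evaluates both sides exactly for a fair $\pm 1$ random walk by enumerating all paths of length $n$; the walk and the name `maximal_inequality_sides` are assumptions of this example rather than part of the text.

```python
from fractions import Fraction
from itertools import product


def maximal_inequality_sides(n, c):
    """Return (P{max_k |zeta_k| >= c}, E|zeta_n| / c) for a fair +-1 walk of n steps."""
    hits = 0
    abs_end = 0
    for steps in product((-1, 1), repeat=n):
        position = 0
        peak = 0
        for step in steps:
            position += step
            peak = max(peak, abs(position))
        if peak >= c:
            hits += 1
        abs_end += abs(position)
    paths = 2 ** n
    return Fraction(hits, paths), Fraction(abs_end, paths) / c


sides = {c: maximal_inequality_sides(10, c) for c in range(1, 6)}
```

For small levels $C$ the bound exceeds 1 and is therefore uninformative; it becomes useful only when $C$ is large compared with $\mathsf{E}|\zeta_n|$.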

The number of crossings $\nu[a, b)$ of the half-interval $[a, b)$ by a family
of random variables $\{\zeta(t), t \in T\}$, where $T$ is an ordered set, is defined as
the least upper bound of numbers $s$ such that there exists a sequence
$\{t_i, i = 1, 2, \dots, 2s\}$, $t_i \in T$, $t_1 < t_2 < \dots < t_{2s}$, for which

$$\zeta(t_1) \ge b, \quad \zeta(t_2) < a, \quad \zeta(t_3) \ge b, \quad \dots, \quad \zeta(t_{2s}) < a.$$

We estimate the mathematical expectation of the number of crossings
of the half-interval $[a, b)$ by the sequence $\{\zeta_k, k = 1, 2, \dots, n\}$ defined in
Lemma 3.

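Introduce the sequence of random integer-valued variables $j_1, j_2, \dots, j_k, \dots$,
which may not always exist. Let $A_k$ denote the event: the
number $j_k$ exists. Moreover, let $j_1$ be the smallest integer such that
$\zeta_{j_1} \ge b$, $j_1 \le n$; $j_2$ be the smallest integer larger than $j_1$ such that $\zeta_{j_2} < a$, $j_2 \le n$, and
so on; $j_{2m-1}$ be the smallest integer larger than $j_{2m-2}$ such that $\zeta_{j_{2m-1}} \ge b$, $j_{2m-1} \le n$,
and $j_{2m}$ be the smallest integer larger than $j_{2m-1}$ such that
$\zeta_{j_{2m}} < a$, $j_{2m} \le n$.

The numbers $j_1, j_2, \dots$ form a monotonically non-decreasing
sequence of sampling variables, $j_k \le n$. We extend the definition of $j_k$ on
the whole set $\Omega$ by putting $j_k = n$ if $\omega \notin A_k$. It follows from the definition
of $j_k$ and the inequality (13) that

$$0 \le \int_{A_{2m-1}} (\zeta_{j_{2m-1}} - b)\, \mathsf{P}(d\omega) \le \int_{A_{2m-1}} (\zeta_{j_{2m}} - b)\, \mathsf{P}(d\omega) + \sum_{k \ge 1} \int_{B_{m,k}} \bar\xi^-_{j_{2m-1}+k}\, \mathsf{P}(d\omega) \le$$

$$\le (a - b)\, \mathsf{P}(A_{2m}) + \int_{A_{2m-1} \setminus A_{2m}} (\zeta_n - b)\, \mathsf{P}(d\omega) + \sum_{k \ge 1} \int_{B_{m,k}} \bar\xi^-_{j_{2m-1}+k}\, \mathsf{P}(d\omega),$$

where $\bar\xi_k = \mathsf{E}\{\xi_k \mid \mathfrak{F}_{k-1}\}$ and $B_{m,k} = A_{2m-1} \cap \{j_{2m} - j_{2m-1} \ge k\}$. Hence

$$(b - a)\, \mathsf{P}(A_{2m}) \le \int_{A_{2m-1} \setminus A_{2m}} (\zeta_n - b)\, \mathsf{P}(d\omega) + \sum_{k \ge 1} \int_{B_{m,k}} \bar\xi^-_{j_{2m-1}+k}\, \mathsf{P}(d\omega) \le$$

$$\le \int_{A_{2m-1} \setminus A_{2m}} (\zeta_n - b)^+\, \mathsf{P}(d\omega) + \sum_{k \ge 1} \int_{B_{m,k}} \bar\xi^-_{j_{2m-1}+k}\, \mathsf{P}(d\omega).$$

Summing up these inequalities over all $m \ge 1$, we obtain

$$(b - a) \sum_{m \ge 1} \mathsf{P}(A_{2m}) \le \mathsf{E}[(\zeta_n - b)^+ + Q_n], \qquad Q_n = \sum_{k=1}^{n} \bar\xi_k^-.$$

Note that $\nu[a, b) = \sum_{m \ge 1} \chi(A_{2m})$, where $\chi(A)$ is the indicator of event $A$.
Therefore

$$\mathsf{E}\nu[a, b) = \sum_{m \ge 1} \mathsf{P}(A_{2m}).$$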



We have thus proved the following

Lemma 4. If a sequence $\{\zeta_k;\ k = 1, 2, \dots, n\}$ satisfies the conditions of
Lemma 3, then

$$\mathsf{E}\nu[a, b) \le \frac{\mathsf{E}[(\zeta_n - b)^+ + Q_n]}{b - a}. \tag{19}$$

In the case of submartingales the last inequality becomes

$$\mathsf{E}\nu[a, b) \le \frac{\mathsf{E}(\zeta_n - b)^+}{b - a}. \tag{20}$$
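
The crossing count and the bound (20) can be made concrete as follows. The sketch counts, for a finite sequence, the completed passages from a value $\ge b$ to a later value $< a$ (the scheme used in the definition of $\nu[a, b)$), and then evaluates both sides of (20) exactly for a fair $\pm 1$ random walk $\zeta_1, \dots, \zeta_n$. The walk and the function names are assumptions of this example.

```python
from fractions import Fraction
from itertools import product


def crossings(values, a, b):
    """Number of completed passages from a value >= b to a later value < a."""
    count = 0
    above = False  # True after a value >= b that has not yet been followed by a value < a
    for v in values:
        if not above and v >= b:
            above = True
        elif above and v < a:
            above = False
            count += 1
    return count


def crossing_bound_sides(n, a, b):
    """Return ((b - a) * E nu[a, b), E (zeta_n - b)^+) for a fair +-1 walk zeta_1..zeta_n."""
    total_crossings = 0
    total_excess = 0
    for steps in product((-1, 1), repeat=n):
        walk = []
        position = 0
        for step in steps:
            position += step
            walk.append(position)
        total_crossings += crossings(walk, a, b)
        total_excess += max(walk[-1] - b, 0)
    paths = 2 ** n
    return (b - a) * Fraction(total_crossings, paths), Fraction(total_excess, paths)
```

Taking the earliest admissible index at every stage, as `crossings` does, realizes the least upper bound in the definition, because postponing a passage can never increase the number of later ones.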

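These inequalities are easily generalized for the case when $T$ is a
countable sequence. Thus if $T = \{1, 2, \dots, n, \dots\}$ and $T' = \{\dots, -n, \dots, -1\}$,
then inequality (14) yields

$$\mathsf{P}\{\sup_{n \in T} \zeta_n > C\} \le \frac{1}{C} \sup_n \mathsf{E}(\zeta_n^+ + Q_n), \tag{21}$$

$$\mathsf{P}\{\sup_{n \in T'} \zeta_n > C\} \le \frac{1}{C}\, \mathsf{E}(\zeta_{-1}^+ + Q'), \tag{22}$$

where $Q' = \sum_{k=1}^{\infty} \bar\xi_{-k}^-$, $\bar\xi_k = \mathsf{E}\{\xi_k \mid \mathfrak{F}_{k-1}\}$.

The proof of these relations follows from the fact that
$\sup_{n \in T} \zeta_n = \lim_{n \to \infty} \max_{1 \le k \le n} \zeta_k$, and thus

$$\mathsf{P}\{\sup_{n \in T} \zeta_n > C\} = \lim_{n \to \infty} \mathsf{P}\{\max_{1 \le k \le n} \zeta_k > C\}.$$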

Analogously, if $\nu_\infty[a, b)$ and $\nu'_\infty[a, b)$ denote the number of times the
sequences $\{\zeta_n, n \in T\}$ and $\{\zeta_n, n \in T'\}$ correspondingly cross a half-interval
$[a, b)$ from left to right and $\nu_n[a, b)$ and $\nu'_n[a, b)$ denote the number of
times the truncated sequences $\{\zeta_k, k = 1, \dots, n\}$ and $\{\zeta_{-k}, k = 1, \dots, n\}$
correspondingly cross the half-interval $[a, b)$, then in view of the fact
that $\nu_n[a, b)$ and $\nu'_n[a, b)$ form monotonically non-decreasing sequences,
$\nu_\infty[a, b) = \lim \nu_n[a, b)$, $\nu'_\infty[a, b) = \lim \nu'_n[a, b)$, and property f), we obtain that the
inequalities

$$(b - a)\, \mathsf{E}\nu_\infty[a, b) \le \sup_n \mathsf{E}[(\zeta_n - b)^+ + Q_n], \tag{23}$$

$$(b - a)\, \mathsf{E}\nu'_\infty[a, b) \le \mathsf{E}[(\zeta_{-1} - b)^+ + Q'] \tag{24}$$

are valid.

In the case of submartingales the same argument is applicable also
when $T$ is an arbitrary countable set of reals. Here one should introduce
a monotonically increasing sequence of sets $T_n$ consisting of a
finite number of points, $\bigcup_n T_n = T$, apply to the sequences $\{\zeta(t), t \in T_n\}$ the
inequalities obtained and pass on to the limit as $n \to \infty$.

We then obtain that

$$\mathsf{P}\{\sup_{t \in T} \zeta^+(t) > C\} \le \frac{\sup_{t \in T} \mathsf{E}\zeta^+(t)}{C}, \tag{25}$$

$$\mathsf{E}[\sup_{t \in T} \zeta^+(t)]^p \le \Big(\frac{p}{p-1}\Big)^p \sup_{t \in T} \mathsf{E}[\zeta^+(t)]^p, \tag{26}$$

$$\mathsf{E}\nu[a, b) \le \frac{\sup_{t \in T} \mathsf{E}(\zeta(t) - b)^+}{b - a}. \tag{27}$$

Moreover if the set $T$ possesses the maximal element then the least
upper bounds in the r.h.s. of the above inequalities are actually attained.



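Existence of the limit. Consider the sequence $\{\zeta_n, \mathfrak{F}_n, n \in T\}$, where
$T = \{\dots, -n, -n+1, \dots, -1, 0, 1, \dots, n, \dots\}$, $\{\mathfrak{F}_n, n \in T\}$ is a monotonically
non-decreasing family of $\sigma$-algebras, $\zeta_n$ is $\mathfrak{F}_n$-measurable and let moreover
$\mathsf{E}|\zeta_n| < \infty$, $\xi_n = \zeta_n - \zeta_{n-1}$ and $\mathsf{E}\{\xi_n \mid \mathfrak{F}_{n-1}\} = \bar\xi_n$.

Theorem 1. a) If

$$\sup_{n \ge 1} \mathsf{E}(\zeta_n^+ + Q_n) < \infty, \qquad Q_n = \sum_{k=1}^{n} \bar\xi_k^-,$$

then there exists with probability 1 the finite limit $\zeta_\infty = \lim_{n \to \infty} \zeta_n$;

b) if $\sup_{n \ge 1} \mathsf{E}Q'_n < \infty$, where $Q'_n = \sum_{k=1}^{n} \bar\xi_{-k}^-$, there exists with probability
1 the finite limit $\zeta_{-\infty} = \lim_{n \to \infty} \zeta_{-n}$.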



Proof. From Fatou's inequality

$$\mathsf{E} \liminf_{n \to +\infty} \zeta_n^+ \le \liminf_{n \to +\infty} \mathsf{E}\zeta_n^+ < \infty$$

it follows that the relation $\liminf_{n \to +\infty} \zeta_n = +\infty$ holds only with probability 0.

On the other hand it follows from inequality (23) that

$$\lim_{a \to -\infty} \mathsf{E}\nu[a, b) = \mathsf{E} \lim_{a \to -\infty} \nu[a, b) = 0,$$

where $\nu[a, b)$ is the number of crossings of the half-interval $[a, b)$ by
the sequence $\{\zeta_n;\ n \ge 1\}$. Therefore if $\zeta_1 > -\infty$ then with probability 1
there exists $a = a(\omega)$ such that $\zeta_n \ge a$ for all $n \ge 1$. Hence $\liminf_{n \to +\infty} \zeta_n > -\infty$
a.s.

Assume now that the finite $\lim_{n \to +\infty} \zeta_n$ does not exist with a positive
probability. Then, with a positive probability, $\liminf_{n \to +\infty} \zeta_n < \limsup_{n \to +\infty} \zeta_n$.
Therefore a pair of numbers $a$ and $b$ can be found such that with a positive
probability

$$\liminf_{n \to +\infty} \zeta_n < a < b < \limsup_{n \to +\infty} \zeta_n.$$

Then, however, the number of crossings $\nu[a, b)$ of the half-interval $[a, b)$
by the sequence $\{\zeta_n, n \ge 1\}$ is equal to $\infty$ with no lesser probability. This
contradicts inequality (23). Thus with probability 1 $\liminf_{n \to +\infty} \zeta_n = \limsup_{n \to +\infty} \zeta_n$.

The proof of assertion b) is analogous. We merely note that if we let
$b \to +\infty$ in (24) then we obtain that $\lim_{b \to +\infty} \nu'[a, b) = 0$ with probability 1,
where $\nu'[a, b)$ is the number of crossings of the half-interval $[a, b)$ by
the sequence $\{\zeta_n;\ n \le -1\}$. From here it follows that $\limsup \zeta_n < +\infty$ with
probability 1. All the other arguments given in the proof of assertion a)
remain valid with obvious modifications and $\zeta_{-1}$ takes the place of
$\zeta_1$. □

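When applied to semi-martingales the theorem just proved yields

Corollary. If $\{\zeta_k, \mathfrak{F}_k, k = \dots, -n, \dots, -1\}$ is a submartingale then
the limit $\lim_{n \to -\infty} \zeta_n = \zeta_{-\infty}$ exists with probability 1. If $\{\zeta_k, \mathfrak{F}_k, k = 1, 2, \dots\}$
is a submartingale and $\sup \mathsf{E}\zeta_n^+ < \infty$, then the limit $\lim_{n \to +\infty} \zeta_n = \zeta_{+\infty}$
exists with probability 1.

Definition 2. A random variable $\bar\zeta$ ($\underline{\zeta}$) is called a closure from the right
(left) of the submartingale $\{\zeta_t, \mathfrak{F}_t, t \in T\}$ if $\mathsf{E}\bar\zeta^+ < \infty$ ($\mathsf{E}\underline{\zeta}^+ < \infty$), $\bar\zeta$ is
measurable with respect to $\mathfrak{F} = \sigma\{\mathfrak{F}_t, t \in T\}$ ($\underline{\zeta}$ is measurable with respect to
$\mathfrak{F} = \bigcap_{t \in T} \mathfrak{F}_t$) and for all $t \in T$

$$\mathsf{E}\{\bar\zeta \mid \mathfrak{F}_t\} \ge \zeta_t \qquad (\mathsf{E}\{\zeta_t \mid \mathfrak{F}\} \ge \underline{\zeta}).$$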

Theorem 2. A submartingale $\{\zeta_n, \mathfrak{F}_n, n = 1, 2, \dots\}$ has a closure from the
right if and only if the sequence $\{\zeta_n^+, n = 1, 2, \dots\}$ is uniformly integrable.

Proof. If the submartingale $\{\zeta_n, \mathfrak{F}_n, n = 1, 2, \dots\}$ possesses a closure $\bar\zeta$ from
the right then putting $T = \{1, 2, \dots, n, \dots\} \cup \{\infty\}$, $\mathfrak{F}_\infty = \sigma\{\mathfrak{F}_n, n = 1, 2, \dots\}$,
$\zeta_\infty = \bar\zeta$, we obtain that $\{\zeta_t, \mathfrak{F}_t, t \in T\}$ forms a submartingale and the set
$T$ possesses the maximal element $\infty$. Therefore (in view of Lemma 1) the
family $\{\zeta_n^+, n = 1, 2, \dots\}$ is uniformly integrable. Assume now that the
family $\{\zeta_n^+, n = 1, 2, \dots\}$ is uniformly integrable. Since $\sup \mathsf{E}\zeta_n^+ < \infty$, there
exists with probability 1 the limit $\bar\zeta = \lim \zeta_n$. Let $\zeta_n^N = \max\{\zeta_n, -N\}$,
$N > 0$. Since for any $N$ the sequence $\{\zeta_n^N, n = 1, 2, \dots\}$ is a submartingale,
it follows from the definition of submartingales that
$\mathsf{E}\zeta_n^N \chi(A) \le \mathsf{E}\zeta_{n+m}^N \chi(A)$ for any $A \in \mathfrak{F}_n$, $m > 0$. Passing to the limit as $m \to \infty$ in the r.h.s. of
the inequality and taking into account that the sequence $\{\zeta_n^N, n = 1, 2, \dots\}$
is uniformly integrable (being a sum of two sequences one of which is,
by assumption, uniformly integrable while the second is bounded in its
absolute value by the constant $N$ and hence also uniformly integrable) we
obtain

$$\mathsf{E}\zeta_n^N \chi(A) \le \mathsf{E}\bar\zeta^N \chi(A), \qquad \bar\zeta^N = \max\{\bar\zeta, -N\}.$$

Passing to the limit as $N \to \infty$ we obtain the inequality

$$\mathsf{E}\zeta_n \chi(A) \le \mathsf{E}\bar\zeta \chi(A);$$

moreover, $\mathsf{E}\bar\zeta^+ \le \lim \mathsf{E}\zeta_n^+ < \infty$, so that $\bar\zeta$ is a closure of the submartingale. □

Theorem 3. Let $\{\zeta_n, \mathfrak{F}_n;\ n = 1, 2, \dots\}$ be a martingale. The following
conditions are equivalent:

a) the family $\{\zeta_n, n = 1, 2, \dots\}$ is uniformly integrable;

b) the martingale $\{\zeta_n, \mathfrak{F}_n, n = 1, 2, \dots\}$ possesses a closure from the
right;

c) $\mathsf{E}|\zeta_n - \zeta_{n'}| \to 0$ for $n, n' \to \infty$.

If one of these conditions is satisfied then $\lim \zeta_n = \zeta$ exists with probability 1
and this limit is the closure from the right of the martingale in the
sense of convergence in $\mathcal{L}_1$.

If for some $p > 1$, $\mathsf{E}|\zeta_n|^p \le C$, then a), b) and c) are valid and $\zeta = \lim \zeta_n$
in the sense of convergence in $\mathcal{L}_p$.

Proof. The equivalence of a) and b) follows from the preceding theorem.
The equivalence of uniform integrability and convergence in $\mathcal{L}_1$ is a
general result in measure theory. If the assertion stated in the second
part of the theorem is fulfilled then the sequence $\{\zeta_n\}$ is uniformly
integrable. In view of (17) the sequence $|\zeta_n|^p$ possesses an integrable majorant
$\sup |\zeta_n|^p$, and hence the sequences $|\zeta_n|^p$ and $|\zeta - \zeta_n|^p$, where $\zeta = \lim \zeta_n$, are
uniformly integrable. It follows from here that $\mathsf{E}|\zeta_n - \zeta|^p \to 0$ for $n \to \infty$. □

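Some applications. Let $\mathfrak{F}_n \subset \mathfrak{F}_{n+1} \subset \mathfrak{S}$ and $\mathfrak{F} = \sigma\{\mathfrak{F}_n, n = 1, 2, \dots\}$,
$\xi$ be a random variable with $\mathsf{E}|\xi| < \infty$. Put

$$\zeta_n = \mathsf{E}\{\xi \mid \mathfrak{F}_n\}.$$

Theorem 4. The sequence $\{\zeta_n, \mathfrak{F}_n, n = 1, 2, \dots\}$ is a martingale and

$$\lim \mathsf{E}\{\xi \mid \mathfrak{F}_n\} = \mathsf{E}\{\xi \mid \mathfrak{F}\}$$

with probability 1.

Proof. We have (Chapter I, Section 3)

$$\mathsf{E}\{\zeta_{n+1} \mid \mathfrak{F}_n\} = \mathsf{E}\{\mathsf{E}\{\xi \mid \mathfrak{F}_{n+1}\} \mid \mathfrak{F}_n\} = \mathsf{E}\{\xi \mid \mathfrak{F}_n\} = \zeta_n$$

and moreover $\zeta_n$ is $\mathfrak{F}_n$-measurable so that the sequence $\{\zeta_n, \mathfrak{F}_n, n = 1, 2, \dots\}$
is a martingale. Also

$$\mathsf{E}\{\mathsf{E}\{\xi \mid \mathfrak{F}\} \mid \mathfrak{F}_n\} = \mathsf{E}\{\xi \mid \mathfrak{F}_n\} = \zeta_n.$$

The last equality means that $\mathsf{E}\{\xi \mid \mathfrak{F}\}$ is the closure of this martingale.
The assertion of the theorem now follows from Theorem 3. □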

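Corollary 1. If $A \in \mathfrak{F}$, then with probability 1 $\lim_{n \to \infty} \mathsf{P}\{A \mid \mathfrak{F}_n\} = \chi(A)$.

Let $\{\mathcal{X}, \mathfrak{B}\}$ be a measurable space. The sequence of subdivisions
$\{A_{nk}, k = 1, 2, \dots\}$, $n = 1, 2, \dots$, of the space $\mathcal{X}$ is called exhaustive if

a) $A_{nk} \cap A_{nr} = \emptyset$ for $k \ne r$, $\bigcup_{k=1}^{\infty} A_{nk} = \mathcal{X}$, $n = 1, 2, \dots$;

b) the $(n+1)$-st subdivision is a refinement of the $n$-th one, i.e. for any $j$,
$A_{n+1, j} \subset A_{nk}$ for some $k = k(j)$;

c) the minimal $\sigma$-algebra which contains all $A_{nk}$, $k = 1, 2, \dots$,
$n = 1, 2, \dots$, coincides with $\mathfrak{B}$.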

Corollary 2. Let $\{A_{nk}, k = 1, 2, \dots\}$, $n = 1, 2, \dots$, be an exhaustive
sequence of subdivisions of $\{\mathcal{X}, \mathfrak{B}\}$ and $m$ be a measure on $\mathfrak{B}$, $m(\mathcal{X}) = 1$.
Denote by $A_n(x)$ the set $A_{nk}$ which contains the point $x$. Then for any
$\mathfrak{B}$-measurable and $m$-integrable function $f(x)$ we have

$$\lim_{n \to \infty} \frac{\int_{A_n(x)} f(u)\, m(du)}{m(A_n(x))} = f(x)$$

for $m$-almost all $x$.

The last assertion can be regarded as an analogue of the basic theorem
of integral calculus for abstract integrals. The validity of this assertion
follows from the fact that if $\mathfrak{F}_n$ is viewed as the $\sigma$-algebra generated by a
finite or countable number of sets $\{A_{nk}, k = 1, 2, \dots\}$, then $\sigma\{\mathfrak{F}_n, n = 1, 2, \dots\} = \mathfrak{B}$ and

$$\mathsf{E}\{f \mid \mathfrak{F}_n\} = \frac{\int_{A_{nk}} f(u)\, m(du)}{m(A_{nk})} \quad \text{for } x \in A_{nk}$$

(Chapter I, Section 3).

Moreover, the right-hand side of the last relation is not defined if
$m(A_{nk}) = 0$. But the $m$-measure of the set of $x$'s such that the latter
holds, if only for one $n$, is zero.
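
A simple instance of Corollary 2 is obtained by taking, as an assumption of this example, $\mathcal{X} = [0, 1)$ with Lebesgue measure $m$, the dyadic subdivisions $A_{nk} = [k 2^{-n}, (k+1) 2^{-n})$ and $f(u) = u^2$. The sketch below computes the averages of $f$ over $A_n(x)$ in exact arithmetic; the name `dyadic_average` is hypothetical.

```python
from fractions import Fraction
from math import floor


def dyadic_average(x, n):
    """Average of f(u) = u^2 over the dyadic interval of length 2^-n containing x."""
    width = Fraction(1, 2 ** n)
    lo = floor(x / width) * width
    hi = lo + width
    # The integral of u^2 over [lo, hi) equals (hi^3 - lo^3) / 3.
    return (hi ** 3 - lo ** 3) / (3 * width)


x = Fraction(1, 3)
errors = [abs(dyadic_average(x, n) - x ** 2) for n in range(1, 21)]
```

The averages at level $n$ are the values of $\mathsf{E}\{f \mid \mathfrak{F}_n\}$, so they form a martingale: the mean of the two level-$(n+1)$ averages inside one level-$n$ interval equals the level-$n$ average.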

Using similar reasoning one can obtain a “direct” proof (in a certain 
sense) of Radon’s theorem on absolute continuity of measures. 

Lemma 5. Let $\{\mathcal{X}, \mathfrak{B}\}$ be a measurable space and let the $\sigma$-algebra $\mathfrak{B}$ be
generated by a countable sequence of sets, $\mathfrak{B} = \sigma\{B_1, B_2, \dots\}$. Then
there exists in $\{\mathcal{X}, \mathfrak{B}\}$ an exhaustive sequence of subdivisions.

Proof. Let the sequence $\{A_{1k}, k = 1, 2, \dots\}$ consist of the two sets $B_1$ and
$\mathcal{X} \setminus B_1$. If $\{A_{nk}, k = 1, 2, \dots\}$ is already constructed then the sequence
$\{A_{n+1, k}, k = 1, 2, \dots\}$ is defined as the collection of all the sets of the type
$A_{nk} \cap B_{n+1}$ and $A_{nk} \setminus B_{n+1}$ ($k = 1, 2, \dots$). □

Theorem 5. Let $\{\mathcal{X}, \mathfrak{B}, m\}$ be a probability space, $q(\cdot)$ be a measure on
$\{\mathcal{X}, \mathfrak{B}\}$, $q(\mathcal{X}) < \infty$; let the measure $q$ be absolutely continuous with respect
to $m$ and let $\{A_{nk}, k = 1, 2, \dots\}$, $n = 1, 2, \dots$, be an arbitrary exhaustive
sequence of subdivisions of $\mathcal{X}$. Put

$$g_n(x) = \frac{q(A_{nk}(x))}{m(A_{nk}(x))}$$

for $m(A_{nk}(x)) > 0$, where $A_{nk}(x)$ is that set of the sequence $\{A_{nk}, k = 1, 2, \dots\}$
which contains the point $x$. If $m(A_{nk}(x)) = 0$, we put $g_n(x) = 0$. Then:

a) the sequence $\{g_n(x), \mathfrak{F}_n, n = 1, 2, \dots\}$, where $\mathfrak{F}_n = \sigma\{A_{n1}, A_{n2}, \dots\}$, forms
a martingale;

b) there exists the limit $g(x) = \lim_{n \to \infty} g_n(x)$ (mod $m$) independent (mod $m$)
of the choice of the exhaustive sequence $\{A_{nk}, k = 1, 2, \dots\}$, $n = 1, 2, \dots$;

c) for an arbitrary $B \in \mathfrak{B}$

$$q(B) = \int_B g(x)\, m(dx). \tag{28}$$

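Proof. The function $g_n(x)$ is $\mathfrak{F}_n$-measurable and takes on at most a
countable number of values. Therefore

$$\mathsf{E}\{g_{n+1} \mid \mathfrak{F}_n\} = \sum_j g_{n+1}(A_{n+1, j})\, \frac{m(A_{n+1, j} \cap A_{nk}(x))}{m(A_{nk}(x))} = \sum_{j'} \frac{q(A_{n+1, j'})}{m(A_{n+1, j'})}\, \frac{m(A_{n+1, j'})}{m(A_{nk}(x))} = \frac{q(A_{nk}(x))}{m(A_{nk}(x))} = g_n(x),$$

where $g_{n+1}(A_{n+1, j})$ is the value of $g_{n+1}$ on $A_{n+1, j}$ and the summation over $j'$
extends over those indices for which $A_{n+1, j'} \subset A_{nk}(x)$,
which proves a). Furthermore

$$\int_{\mathcal{X}} |g_n(x)|\, m(dx) = \int_{\mathcal{X}} g_n(x)\, m(dx) = q(\mathcal{X}) < \infty$$

and, for $A \in \mathfrak{F}_n$,

$$\int_A g_n(x)\, m(dx) = \sum_{A_{nk} \subset A} \int_{A_{nk}} g_n(x)\, m(dx) = \sum_{A_{nk} \subset A} q(A_{nk}) \le q(A).$$

Since $q$ is absolutely continuous with respect to the measure $m$, for any
$\varepsilon > 0$ a $\delta > 0$ can be found such that $m(A) < \delta$ implies $q(A) < \varepsilon$. From
here it follows that the sequence $g_n(x)$ is uniformly integrable (with
respect to the measure $m$). Therefore there exists for $m$-almost all $x$ the limit
$\lim g_n(x) = g(x)$ and $g(x)$ is the closure of the martingale. Hence for each $A_{nk}$

$$\int_{A_{nk}} g(x)\, m(dx) = \lim_{r \to \infty} \int_{A_{nk}} g_r(x)\, m(dx) = q(A_{nk}).$$

Since the class of sets $A$ for which formula (28) is valid is monotonic and
contains the algebra $\bigcup_n \mathfrak{F}_n$, formula (28) is valid also for $\sigma\{\mathfrak{F}_n, n = 1, 2, \dots\}$, i.e.
it is valid for any $B \in \mathfrak{B}$. Finally, assertion c) implies the independence of the
function $g(x)$ of the choice of the exhaustive sequence. If there exist
two functions $g'$ and $g''$ for which c) holds then $\int_B [g'(x) - g''(x)]\, m(dx) = 0$
for any $B \in \mathfrak{B}$, which is possible if and only if $g'(x) = g''(x)$ (mod $m$). □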


§3. Series 

Some general criteria for convergence of series. In the present section 
conditions are investigated for convergence with probability 1 of series 
with random terms. 

Let the series

$$\xi_1 + \xi_2 + \dots + \xi_n + \dots \tag{1}$$

be given.

Theorem 1. If there exists a sequence of numbers $\varepsilon_n > 0$, $n = 1, 2, \dots$, such
that

$$\sum_{n=1}^{\infty} \varepsilon_n < \infty, \qquad \sum_{n=1}^{\infty} \mathsf{P}\{|\xi_n| > \varepsilon_n\} < \infty, \tag{2}$$

then series (1) is absolutely convergent with probability 1.

Proof. Let $A_n = \{|\xi_n| > \varepsilon_n\}$. From the convergence of the second series in
(2) and Theorem 6 of Section 2 in Chapter I it follows that $\mathsf{P}(\limsup A_n) = 0$,
i.e. with probability 1 only a finite number of events $A_n$ can occur. Therefore
there exists $N = N(\omega)$ such that $|\xi_n| \le \varepsilon_n$ for $n > N(\omega)$, and series (1)
converges. □

For random variables with finite moments one can formulate the 
following sufficient condition for the convergence of series (1): 
Theorem 2. If

$$\sum_{n=1}^{\infty} \mathsf{E}|\xi_n| < \infty, \tag{3}$$

then series (1) is absolutely convergent with probability 1.

The proof follows from Lebesgue's theorem which assures that

$$\mathsf{E} \sum_{1}^{\infty} \xi_n^+ = \sum_{1}^{\infty} \mathsf{E}\xi_n^+, \qquad \mathsf{E} \sum_{1}^{\infty} \xi_n^- = \sum_{1}^{\infty} \mathsf{E}\xi_n^-,$$

and hence with probability 1 the series

$$\sum_{1}^{\infty} (\xi_n^+ + \xi_n^-) = \sum_{1}^{\infty} |\xi_n|$$

is convergent. □

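Corollary. If there exists a sequence $c_n > 0$, $n = 1, 2, \dots$, and $p > 1$ such that
the series

$$\sum_{n=1}^{\infty} c_n^{-q}, \qquad \sum_{n=1}^{\infty} c_n^{p}\, \mathsf{E}|\xi_n|^p, \qquad \frac{1}{p} + \frac{1}{q} = 1,$$

are convergent, then series (1) converges with probability 1 and in $\mathcal{L}_1$ as well.

To prove this assertion we note that in view of Hölder's and Jensen's
inequalities we have

$$\sum_{k=m+1}^{m+n} \mathsf{E}|\xi_k| = \sum_{k=m+1}^{m+n} c_k^{-1} c_k\, \mathsf{E}|\xi_k| \le \Big(\sum_{k=m+1}^{m+n} c_k^{-q}\Big)^{1/q} \Big(\sum_{k=m+1}^{m+n} c_k^{p} (\mathsf{E}|\xi_k|)^p\Big)^{1/p} \le \Big(\sum_{k=m+1}^{m+n} c_k^{-q}\Big)^{1/q} \Big(\sum_{k=m+1}^{m+n} c_k^{p}\, \mathsf{E}|\xi_k|^p\Big)^{1/p},$$

and the convergence of series (3) follows from this inequality by taking
into account the premise of the corollary. □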

Stronger results are valid for semi-martingales. Put

$$\zeta_n = \xi_1 + \xi_2 + \dots + \xi_n, \qquad \zeta_0 = 0.$$

Theorem 3. Let $\xi_n$ be $\mathfrak{F}_n$-measurable and $\{\mathfrak{F}_n, n = 0, 1, \dots\}$ be a current of
$\sigma$-algebras. Then:

a) if $\mathsf{E}\{\xi_n \mid \mathfrak{F}_{n-1}\} \ge 0$ and $\sup \mathsf{E}\zeta_n^+ < \infty$, series (1) converges with
probability 1;

b) if $\mathsf{E}\{\xi_n \mid \mathfrak{F}_{n-1}\} = 0$ and for some $p > 1$

$$\sup_n \mathsf{E}|\zeta_n|^p < \infty,$$

then series (1) is convergent with probability 1 and in $\mathcal{L}_p$ as well.

Condition a) is equivalent to the assumption that $\{\zeta_n, \mathfrak{F}_n, n = 1, 2, \dots\}$
is a submartingale. The corresponding assertion is therefore a corollary
of the theorem on convergence of submartingales. Condition b) means
that $\{\zeta_n, \mathfrak{F}_n, n = 1, 2, \dots\}$ is a martingale and the assertion of this part of
the theorem follows from Theorem 3 of Section 2. □

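Corollary 1. If $\mathsf{E}\{\xi_n \mid \mathfrak{F}_{n-1}\} = 0$ and

$$\sum_{n=1}^{\infty} \mathsf{E}\xi_n^2 < \infty,$$

then series (1) converges with probability 1 and in $\mathcal{L}_2$ as well.

The proof follows from the fact that $\mathsf{E}\xi_k \xi_j = 0$ for $k < j$, so that

$$\mathsf{E}\zeta_n^2 = \mathsf{E}\Big(\sum_{k=1}^{n} \xi_k\Big)^2 = \sum_{k=1}^{n} \mathsf{E}\xi_k^2 + 2 \sum_{j=2}^{n} \sum_{k<j} \mathsf{E}\xi_k \xi_j = \sum_{k=1}^{n} \mathsf{E}\xi_k^2,$$

and from assertion b) of the theorem. □

For series with independent terms the last result is known as
Kolmogorov's theorem.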



Corollary 2. (Kolmogorov's theorem) If $\{\xi_k, k = 1, 2, \dots\}$ are independent
random variables, $\mathsf{E}\xi_k = 0$ and the series $\sum_{k=1}^{\infty} \mathsf{V}\xi_k < \infty$, then series (1)
converges with probability 1.

This assertion follows from Corollary 1 if the $\sigma$-algebra generated by
the random variables $\xi_1, \xi_2, \dots, \xi_n$ is taken for $\mathfrak{F}_n$ and realizing that in
view of the independence of the random variables

$$\mathsf{E}\{\xi_n \mid \mathfrak{F}_{n-1}\} = \mathsf{E}\xi_n = 0. \ □$$



Series of independent random variables. We shall now discuss in some 
detail the convergence of series with independent terms. As we have seen 
previously such a series is convergent either with probability 0 or with 
probability 1 (Theorem 8, Section 2, Chapter I). 

The following bound on the distribution of the maximum of a sum 
of independent terms will be needed in the sequel : 
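Theorem 4. If $\{\xi_k, k = 1, 2, \dots, n\}$ are independent, $\mathsf{E}\xi_k = 0$ and $|\xi_k| \le c$
with probability 1, where $c$ is a constant, then

$$\mathsf{P}\{\max_{1 \le k \le n} |\zeta_k| \le t\} \le (t + c)^2 \Big(\sum_{k=1}^{n} \sigma_k^2\Big)^{-1}, \tag{4}$$

where $\zeta_k = \xi_1 + \dots + \xi_k$ and $\sigma_k^2 = \mathsf{V}\xi_k$.

Denote by $E_n$ the events $\{\max_{1 \le k \le n} |\zeta_k| \le t\}$, $n = 1, 2, \dots$, and put $E_0 = \Omega$, $\zeta_0 = 0$. These events
form a monotonically decreasing sequence. We have

$$\mathsf{E}\chi(E_n) \zeta_n^2 = \sum_{k=1}^{n} \mathsf{E}\{\chi(E_k) \zeta_k^2 - \chi(E_{k-1}) \zeta_{k-1}^2\} = \sum_{k=1}^{n} \mathsf{E}\chi(E_{k-1})(\zeta_k^2 - \zeta_{k-1}^2) - \sum_{k=1}^{n} \mathsf{E}\chi(E_{k-1} \setminus E_k) \zeta_k^2. \tag{5}$$

Next,

$$\mathsf{E}\chi(E_{k-1} \setminus E_k) \zeta_k^2 = \mathsf{E}\chi(E_{k-1} \setminus E_k)(\zeta_{k-1} + \xi_k)^2 \le (t + c)^2\, \mathsf{E}\chi(E_{k-1} \setminus E_k),$$

$$\sum_{k=1}^{n} \mathsf{E}\chi(E_{k-1} \setminus E_k) \zeta_k^2 \le (t + c)^2 \sum_{k=1}^{n} \mathsf{E}\chi(E_{k-1} \setminus E_k) = (t + c)^2 [1 - \mathsf{P}(E_n)]. \tag{6}$$

Moreover,

$$\mathsf{E}\chi(E_{k-1})(\zeta_k^2 - \zeta_{k-1}^2) = \mathsf{E}\chi(E_{k-1})(2\zeta_{k-1}\xi_k + \xi_k^2) = 2\mathsf{E}\chi(E_{k-1})\zeta_{k-1}\, \mathsf{E}\xi_k + \mathsf{E}\chi(E_{k-1})\, \mathsf{E}\xi_k^2 = \sigma_k^2\, \mathsf{E}\chi(E_{k-1}). \tag{7}$$

Relations (5), (6) and (7) yield

$$t^2\, \mathsf{P}(E_n) \ge \mathsf{E}\chi(E_n) \zeta_n^2 \ge \sum_{k=1}^{n} \sigma_k^2\, \mathsf{E}\chi(E_{k-1}) - (t + c)^2 (1 - \mathsf{P}(E_n)) \ge \mathsf{P}(E_n) \Big\{\sum_{k=1}^{n} \sigma_k^2 + (t + c)^2\Big\} - (t + c)^2$$

or

$$(t + c)^2 \ge \mathsf{P}(E_n) \Big\{\sum_{k=1}^{n} \sigma_k^2 + c^2 + 2ct\Big\},$$

and relation (4) follows from here. □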

In the general case of series with independent terms the problem of 
convergence of series (1) is fully solved using the following theorem. 

Theorem 5. (Kolmogorov's three-series criterion). For the convergence
of series (1) of independent random variables with probability 1 it is
necessary for each $c > 0$ and sufficient that for some $c > 0$ the following series

$$\sum_{n=1}^{\infty} \mathsf{P}\{|\xi_n| > c\}, \tag{8}$$

$$\sum_{n=1}^{\infty} \mathsf{E}\xi'_n, \tag{9}$$

$$\sum_{n=1}^{\infty} \mathsf{V}\xi'_n \tag{10}$$

will be convergent, where $\xi'_n = \xi_n$ for $|\xi_n| \le c$ and $\xi'_n = 0$ for $|\xi_n| > c$.

Proof. Sufficiency. In view of Theorem 3, Corollary 2 the series

$$\sum_{n=1}^{\infty} (\xi'_n - \mathsf{E}\xi'_n)$$

is convergent with probability 1. Taking into account the convergence
of series (9) it follows from here that the series $\sum_{n=1}^{\infty} \xi'_n$ is convergent. It follows
from condition (8) and the Borel-Cantelli lemma that only a finite number
of terms in the series $\sum_{n=1}^{\infty} (\xi_n - \xi'_n)$ is non-zero. Therefore series (1) converges
with probability 1.

Necessity. Let series (1) be convergent with probability 1. Then its general
term tends to zero with probability 1, so that only a finite number of
terms of this series exceeds $c$ ($c > 0$) in absolute value. Therefore the
series $\sum_{n=1}^{\infty} \xi'_n$ is convergent with probability 1. Denote by $\{\eta_n\}$, $n = 1, 2, \dots$,
a sequence of independent random variables which do not depend on
the sequence $\{\xi'_n\}$, $n = 1, 2, \dots$, with $\eta_n$ having the same distribution as $\xi'_n$. Set
$\tilde\xi_n = \xi'_n - \eta_n$. Then the series $\sum_{n=1}^{\infty} \tilde\xi_n$ converges with probability 1, $\mathsf{E}\tilde\xi_n = 0$,
$|\tilde\xi_n| \le 2c$ and $\mathsf{V}\tilde\xi_n = 2\mathsf{V}\xi'_n$. It follows from the convergence of the series
$\sum_{n=1}^{\infty} \tilde\xi_n$ that

$$\mathsf{P}\Big\{\sup_{1 \le n < \infty} \Big|\sum_{k=1}^{n} \tilde\xi_k\Big| < \infty\Big\} = 1.$$

Therefore

$$\mathsf{P}\Big\{\sup_{1 \le n < \infty} \Big|\sum_{k=1}^{n} \tilde\xi_k\Big| \le t\Big\} = \alpha > 0$$

for some $t$. It follows from inequality (4) that for any $n$

$$2 \sum_{k=1}^{n} \mathsf{V}\xi'_k = \sum_{k=1}^{n} \mathsf{V}\tilde\xi_k \le \frac{(2c + t)^2}{\alpha},$$

which proves the convergence of series (10). It now follows from Theorem
3, Corollary 2 that the series $\sum_{n=1}^{\infty} (\xi'_n - \mathsf{E}\xi'_n)$ converges with probability 1.
In turn it follows from this fact that series (9) is convergent. Convergence
of series (8) follows from the Borel-Cantelli lemma, since if series (1) is
convergent then with probability 1 only a finite number of terms of series (1)
satisfy $|\xi_n| > c$. The theorem is thus proved. □
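
To see how the three series of Theorem 5 behave in a concrete case, consider (as an assumption of this example) independent terms $\xi_n = \pm s(n)$ taking each sign with probability $1/2$, where $s(n) > 0$ is a deterministic size. The truncated terms are symmetric, so series (9) vanishes, and the sketch below accumulates partial sums of series (8) and (10). The function name and the two choices of $s(n)$ are hypothetical.

```python
from math import pi


def three_series_partial_sums(size, terms, c=1.0):
    """Partial sums of series (8), (9), (10) for xi_n = +-size(n) with equal probabilities."""
    tail = 0.0      # series (8): P{|xi_n| > c} is 1 if size(n) > c and 0 otherwise
    mean = 0.0      # series (9): the truncated variable is symmetric, so each term is 0
    variance = 0.0  # series (10): V xi'_n = size(n)^2 when size(n) <= c, else 0
    for n in range(1, terms + 1):
        s = size(n)
        if s > c:
            tail += 1.0
        else:
            variance += s * s
    return tail, mean, variance


harmonic_signs = three_series_partial_sums(lambda n: 1.0 / n, 10 ** 5)
root_signs = three_series_partial_sums(lambda n: n ** -0.5, 10 ** 5)
```

With $s(n) = 1/n$ the partial sums of (10) stay below $\pi^2/6$, so all three series converge and $\sum \pm 1/n$ converges with probability 1. With $s(n) = n^{-1/2}$ the partial sums of (10) are harmonic sums and grow without bound, so by the necessity part the series diverges with probability 1.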

Corollary. For convergence of series (1) of independent nonnegative random
variables it is necessary that for any $c > 0$ and sufficient that for some
$c > 0$ the following series

$$\sum_{n=1}^{\infty} \mathsf{P}\{\xi_n > c\}, \qquad \sum_{n=1}^{\infty} \mathsf{E}\xi'_n$$

will be convergent.

Indeed, for non-negative variables we have $\mathsf{V}\xi'_n \le \mathsf{E}(\xi'_n)^2 \le c\, \mathsf{E}\xi'_n$, so that the
convergence of series (10) follows from the convergence of series (9). □

Levy obtained an interesting result which states that convergence in 
probability of a series of independent random variables implies conver- 
gence with probability 1 . 

To prove this assertion an inequality is required similar to the in- 
equalities previously obtained for submartingales but without assuming 
the existence of any mathematical expectations. 

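Theorem 6. Let $\{\xi_k, k = 1, 2, \dots, n\}$ be independent random variables,
$\zeta_k = \xi_1 + \dots + \xi_k$, $\zeta_0 = 0$. If $\mathsf{P}\{|\zeta_n - \zeta_k| \le t\} \ge \alpha > 0$, $k = 0, 1, \dots, n$, then

$$\mathsf{P}\{\max_{1 \le k \le n} |\zeta_k| > 2t\} \le \frac{1 - \alpha}{\alpha}. \tag{11}$$

Proof. We introduce the events $A_k = \{|\zeta_1| \le 2t, \dots, |\zeta_{k-1}| \le 2t, |\zeta_k| > 2t\}$,
$B_k = \{|\zeta_n - \zeta_k| \le t\}$, $k = 1, \dots, n$. Then

$$\{|\zeta_n| > t\} \supset \bigcup_{k=1}^{n} (A_k \cap B_k),$$

where the events $A_k$, $k = 1, 2, \dots, n$, are pairwise disjoint and the events
$A_k$, $B_k$ ($k$ fixed, $k = 1, \dots, n$) are independent. Therefore

$$1 - \alpha \ge \mathsf{P}\{|\zeta_n| > t\} \ge \mathsf{P}\Big\{\bigcup_{k=1}^{n} (A_k \cap B_k)\Big\} = \sum_{k=1}^{n} \mathsf{P}(A_k)\, \mathsf{P}(B_k) \ge \alpha \sum_{k=1}^{n} \mathsf{P}(A_k) = \alpha\, \mathsf{P}\{\max_{1 \le k \le n} |\zeta_k| > 2t\},$$

which implies (11). □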

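Theorem 7. If series (1), where $\{\xi_n, n = 1, 2, \dots\}$ are mutually independent,
converges in probability, then it also converges with probability 1.

Let $A_{n, N}$ denote the event $\{\sup_{n', n'' \ge n} |\zeta_{n'} - \zeta_{n''}| > 1/N\}$. Series (1) diverges
on the set $D = \bigcup_{N=1}^{\infty} \bigcap_{n=1}^{\infty} A_{n, N}$. We bound the probability of $D$. Let $\varepsilon$ and $\eta$
be arbitrary positive numbers. The convergence of series (1) in probability
implies the existence of $n_0 = n_0(\varepsilon, \eta)$ such that $\mathsf{P}\{|\zeta_{n'} - \zeta_{n''}| > \varepsilon\} < \eta$
for $n', n'' \ge n_0$. Applying Theorem 6 to the variables $\zeta'_k = \zeta_k - \zeta_{n_0}$, $k > n_0$,
we obtain

$$\mathsf{P}\{\max_{n_0 \le k \le n'} |\zeta'_k| > 2\varepsilon\} = \mathsf{P}\{\max_{n_0 \le k \le n'} |\zeta_k - \zeta_{n_0}| > 2\varepsilon\} \le \frac{\eta}{1 - \eta},$$

where $n'$ is an arbitrary integer greater than $n_0$. Therefore

$$\mathsf{P}\{\sup_{n_0 \le k} |\zeta_k - \zeta_{n_0}| > 2\varepsilon\} \le \frac{\eta}{1 - \eta},$$

where $\eta$ is arbitrarily small. Taking $\varepsilon = 1/(4N)$ and noting that
$|\zeta_{n'} - \zeta_{n''}| \le 2 \sup_{n_0 \le k} |\zeta_k - \zeta_{n_0}|$ for $n', n'' \ge n_0$, it thus follows that

$$\mathsf{P}(A_{n_0, N}) \le \frac{\eta}{1 - \eta} \quad \text{and} \quad \mathsf{P}\Big(\bigcap_{n=1}^{\infty} A_{n, N}\Big) = 0$$

for any $N$. Therefore $\mathsf{P}(D) = 0$ and the theorem is proved. □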

Applications to the strong law of large numbers. Using a simple transformation, one can derive from the theorems on convergence of series with probability 1 theorems of the strong law of large numbers type (i.e. theorems on convergence with probability 1 of means of random variables).



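Lemma 1. If the series $\sum_{n=1}^{\infty} z_n$ converges and $a_n > 0$, $a_n \uparrow \infty$ is a monotonically increasing sequence, then

$$\lim_{n \to \infty} \frac{1}{a_n} \sum_{k=1}^{n} a_k z_k = 0.$$

Proof. Let

$$S_n = \sum_{k=1}^{n} z_k \quad (S_0 = 0), \qquad |S_n| \le c,$$

where $c$ is a constant. Set $a_k - a_{k-1} = \delta_k$, $k = 1, 2, \dots$, $a_0 = 0$. Then

$$\sum_{k=1}^{n} a_k z_k = \sum_{k=1}^{n} (\delta_1 + \delta_2 + \dots + \delta_k) z_k = \sum_{k=1}^{n} \delta_k (S_n - S_{k-1}).$$

Therefore

$$\Bigl|\frac{1}{a_n} \sum_{k=1}^{n} a_k z_k\Bigr| \le \frac{1}{a_n} \sum_{k=1}^{n_0} \delta_k |S_n - S_{k-1}| + \sup_{n_0 \le k \le n} |S_n - S_{k-1}| \le 2c\,\frac{a_{n_0}}{a_n} + \sup_{n_0 \le k \le n} |S_n - S_{k-1}| < \varepsilon$$

for any $\varepsilon > 0$ provided $n$ and $n_0$ are chosen sufficiently large.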
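As a small numerical illustration of Lemma 1 (not part of the original text), the following Python sketch takes the convergent series $z_k = 1/k^2$ and the weights $a_k = k$; both choices, and the function name `kronecker_mean`, are assumptions made only for this example. The weighted mean then equals $H_n / n$, where $H_n$ is the $n$-th harmonic number, and exact rational arithmetic shows it decreasing toward 0.

```python
from fractions import Fraction


def kronecker_mean(n):
    """Return (1/a_n) * sum_{k=1}^{n} a_k * z_k for a_k = k and z_k = 1/k**2."""
    total = sum(Fraction(k) * Fraction(1, k * k) for k in range(1, n + 1))
    return total / n


if __name__ == "__main__":
    # The printed values shrink as n grows, in line with the lemma.
    for n in (1, 10, 100, 1000):
        print(n, float(kronecker_mean(n)))
```

The example only illustrates the statement for one particular pair of sequences; it is not a substitute for the proof above.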

From the lemma just proved and the theorems presented in the pre- 
vious subsection the following assertions follow. 

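Theorem 8. a) If $\{\xi_n,\ n = 1, 2, \dots\}$ is an arbitrary sequence of random variables with finite moments of the first order and

$$\sum_{n=1}^{\infty} \frac{\mathsf{E}|\xi_n - a_n|}{n} < \infty, \qquad a_n = \mathsf{E}\xi_n,$$

then

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} (\xi_k - a_k) = 0$$

with probability 1.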

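b) If $\{\xi_n,\ n = 1, 2, \dots\}$ is a sequence such that the partial sums $\bigl\{\zeta_n = \xi_1 + \frac{\xi_2}{2} + \dots + \frac{\xi_n}{n}\bigr\}$ form a martingale and for some $p \ge 1$

$$\sup_n \mathsf{E}\Bigl|\sum_{k=1}^{n} \frac{\xi_k}{k}\Bigr|^p < \infty,$$

then with probability 1

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \xi_k = 0.$$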



c) If $\{\xi_n,\ n = 1, 2, \dots\}$ are independent and

$$\sum_{n=1}^{\infty} \frac{\operatorname{Var}\xi_n}{n^2} < \infty,$$

then

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} (\xi_k - \mathsf{E}\xi_k) = 0$$

with probability 1.

For identically distributed random variables more powerful 
results will be obtained later as corollaries of general ergodic theorems. 

§ 4. Markov Chains 

Generalizing the notion of a random walk one can arrive at the much 
broader notion of a Markov chain (Markov process) which plays an 
important role in the theory of random processes. Before presenting the 
formal definition we shall discuss a simple but quite general model 
resulting in a Markov chain. 

Systems with random effects. Assume that a stochastic system $\Sigma$ is considered such that the states of this system are represented by points of a certain measurable space $\{\mathcal{X}, \mathfrak{B}\}$. Assume that the transition of the system from the state $\xi(t)$ attained at time $t$ to a new state at time $t + 1$ is completely determined by the value of $t$, the state $\xi(t)$ and some random factor $\alpha_t$ which is independent of the states of the system $\Sigma$ up to time $t$ inclusive and which forms a process with independent values in time. Thus

$$\xi(t+1) = f(t, \xi(t), \alpha_t), \tag{1}$$

where $f(t, x, \alpha)$ is a function of the arguments $t \in T$, $x \in \mathcal{X}$, $\alpha \in A$, where $A$ is a measurable space. Using formula (1) one can express the state of the system $\Sigma$ at any instant of time $s$ starting from the state $\xi(t)$ ($t < s$):

$$\xi(s) = g_{t,s}(\xi(t), \alpha_t, \alpha_{t+1}, \dots, \alpha_{s-1}). \tag{2}$$

If at the initial time $t = 0$ the state $\xi(0)$ is independent of the sequence $\{\alpha_t,\ t \in T\}$, then $\xi(t)$ is independent of the sequence $\{\alpha_t, \alpha_{t+1}, \dots, \alpha_n, \dots\}$.

Let $\{\Omega, \mathfrak{S}, \mathsf{P}\}$ be a probability space on which the random elements $\alpha_t$ are defined. Assume that for any fixed $t$ and $s$ ($s > t$) the function $g_{t,s}(x, \alpha_t, \dots, \alpha_{s-1})$ is $\mathfrak{B} \times \mathfrak{S}$-measurable. Then if the motion of the system $\Sigma$ starts at the time $t$ and the state of the system $\xi(t) = x$ is known, formula (2) allows us to determine the probability that the system $\Sigma$ will find itself in an arbitrary set $A \in \mathfrak{B}$ at the instant of time $s > t$. This probability is called the transition probability and is denoted by $P(t, x, s, A)$. If $\chi_A(x)$ denotes the indicator of the set $A$, then

$$P(t, x, s, A) = \mathsf{E}\,\chi_A\bigl(g_{t,s}(x, \alpha_t, \dots, \alpha_{s-1})\bigr).$$

Let $t < u < v$ and $\xi(t) = x$. The equalities

$$P(t, x, v, A) = \mathsf{E}\,\chi_A\bigl[g_{u,v}(\xi(u), \alpha_u, \dots, \alpha_{v-1})\bigr] = \mathsf{E}\Bigl\{\mathsf{E}\,\chi_A\bigl[g_{u,v}(y, \alpha_u, \dots, \alpha_{v-1})\bigr]\Bigr\}_{y = \xi(u)} = \mathsf{E}\,P(u, \xi(u), v, A)$$

follow from formula (2) and the independence of the variables $\alpha_t, \alpha_{t+1}, \dots, \alpha_{v-1}$.

This can be denoted in the following manner:

$$P(t, x, v, A) = \int P(t, x, u, dy)\, P(u, y, v, A). \tag{3}$$

Relation (3) is called the Chapman-Kolmogorov equation. It expresses an important property of the system under consideration, the absence of an after-effect: if the state of the system at a given instant of time $u$ is known, then the transition probabilities from this state are independent of the behavior of the system at the instants of time preceding $u$. Systems with such properties are called Markovian (or Markov). They often occur in various problems of the natural sciences and engineering. It follows from the definition of transition probabilities that for any non-negative $\mathfrak{B}$-measurable function $f(x)$ we have

$$\mathsf{E} f\bigl(g_{t,s}(x, \alpha_t, \dots, \alpha_{s-1})\bigr) = \int f(y)\, P(t, x, s, dy).$$
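The construction above can be made concrete on a toy system. The following Python sketch is an illustration only: the three-point state space, the two-valued noise law `NOISE`, and the update rule `f` are assumptions chosen for the example, not taken from the text. It computes the transition probabilities $P(t, x, s, \{y\})$ by enumerating all noise paths, as in formula (2), and then checks the Chapman-Kolmogorov equation (3) in exact arithmetic.

```python
from fractions import Fraction
from itertools import product

STATES = (0, 1, 2)
# Law of each random factor alpha_t (independent in time); a toy assumption.
NOISE = {0: Fraction(1, 3), 1: Fraction(2, 3)}


def f(t, x, a):
    """One-step rule xi(t+1) = f(t, xi(t), alpha_t); a hypothetical choice."""
    return (x + a * (t % 2 + 1)) % 3


def g(t, x, alphas):
    """Formula (2): iterate f from time t along the noise path `alphas`."""
    for step, a in enumerate(alphas):
        x = f(t + step, x, a)
    return x


def transition(t, x, s):
    """Return {y: P(t, x, s, {y})} by enumerating alpha_t, ..., alpha_{s-1}."""
    law = {y: Fraction(0) for y in STATES}
    for alphas in product(NOISE, repeat=s - t):
        weight = Fraction(1)
        for a in alphas:
            weight *= NOISE[a]
        law[g(t, x, alphas)] += weight
    return law


def chapman_kolmogorov_holds(t, u, v):
    """Check P(t,x,v,{y}) = sum_z P(t,x,u,{z}) P(u,z,v,{y}) for all x, y."""
    for x in STATES:
        direct = transition(t, x, v)
        middle = transition(t, x, u)
        for y in STATES:
            composed = sum(middle[z] * transition(u, z, v)[y] for z in STATES)
            if composed != direct[y]:
                return False
    return True


if __name__ == "__main__":
    print(transition(0, 0, 1))
    print(chapman_kolmogorov_holds(0, 2, 5))
```

Because the state space is finite, the integral in (3) reduces to a finite sum over the intermediate state $z$.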

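Taking into account the independence of $\xi(t)$ from $\alpha_t, \alpha_{t+1}, \dots$ (Chapter I, § 3) we thus obtain

$$\mathsf{E} f\bigl(g_{t,s}(\xi(t), \alpha_t, \dots, \alpha_{s-1})\bigr) = \mathsf{E} \int f(y)\, P(t, \xi(t), s, dy).$$

The last formula is easily generalized to arbitrary $\mathfrak{B}^m$-measurable non-negative functions $f(x_1, x_2, \dots, x_m)$, $x_k \in \mathcal{X}$. Let $t_1 < t_2 < \dots < t_m$. Then

$$\mathsf{E} f(\xi(t_1), \dots, \xi(t_m)) = \mathsf{E} f\bigl(\xi(t_1), \dots, \xi(t_{m-1}),\ g_{t_{m-1}, t_m}(\xi(t_{m-1}), \alpha_{t_{m-1}}, \dots, \alpha_{t_m - 1})\bigr) =$$

$$= \mathsf{E} \int f(\xi(t_1), \dots, \xi(t_{m-1}), y_m)\, P(t_{m-1}, \xi(t_{m-1}), t_m, dy_m) =$$

$$= \mathsf{E} \iint f(\xi(t_1), \dots, \xi(t_{m-2}), y_{m-1}, y_m)\, P(t_{m-1}, y_{m-1}, t_m, dy_m)\, P(t_{m-2}, \xi(t_{m-2}), t_{m-1}, dy_{m-1}).$$

Thus

$$\mathsf{E} f(\xi(t_1), \xi(t_2), \dots, \xi(t_m)) = \mathsf{E} \int P(t_1, \xi(t_1), t_2, dy_2) \int P(t_2, y_2, t_3, dy_3) \dots \int P(t_{m-1}, y_{m-1}, t_m, dy_m)\, f(\xi(t_1), y_2, \dots, y_m). \tag{4}$$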




If we assume that the initial state of the system is non-random, $\xi(0) = x$, then applying formula (4) to the sequence $\xi(1), \xi(2), \dots, \xi(n)$ and to the function $f(x_1, x_2, \dots, x_n) = \chi_{B^{(n)}}(x_1, \dots, x_n)$, where $B^{(n)}$ is an arbitrary set in $\mathfrak{B}^n$, we obtain a family of marginal distributions $\{P^{(x)}_{1,\dots,n},\ n = 1, 2, \dots\}$ defined by the formula

$$P^{(x)}_{1,\dots,n}(B^{(n)}) = \int \dots \int_{B^{(n)}} P_1(x, dy_1)\, P_2(y_1, dy_2) \dots P_n(y_{n-1}, dy_n), \tag{5}$$

where $P_k(x, A) = P(k-1, x, k, A)$ is the one-step transition probability. In the case when the initial state of the system $\xi(0)$ has an arbitrary distribution $m$ (where $m$ is a probability measure on $\mathfrak{B}$) we obtain from (4), in place of (5), the following system of marginal distributions:

$$P^{(m)}_{0,1,\dots,n}(B^{(n+1)}) = \int \dots \int_{B^{(n+1)}} m(dx)\, P_1(x, dy_1)\, P_2(y_1, dy_2) \dots P_n(y_{n-1}, dy_n). \tag{6}$$

Moreover 






Formula (6) may be used as the basis of a general definition of a Markov chain. However, one should first analyze the meaning of integrals of type (6) when the family of measures $P_k(x, A)$ is defined independently, with no connection to the auxiliary variables $\alpha_t$ and functions $f(t, x, \alpha)$.

Stochastic kernels. Let two measurable spaces $\{\mathcal{X}, \mathfrak{A}\}$ and $\{\mathcal{Y}, \mathfrak{B}\}$ be given.

Definition 1. A stochastic kernel on $\{\mathcal{X}, \mathfrak{B}\}$ is a function $P(x, B)$ ($x \in \mathcal{X}$, $B \in \mathfrak{B}$) satisfying the following conditions:

a) for a fixed $x$ the function $P(x, \cdot)$ is a probability measure on $\mathfrak{B}$;

b) for a fixed $B$ the function $P(\cdot, B)$ is $\mathfrak{A}$-measurable.

If $P(x, \cdot)$ is a measure and $P(x, \mathcal{Y}) \le 1$, then $P(x, B)$ is called a semistochastic kernel.

Lemma 1. Let $f(x, y)$ be a non-negative $\sigma\{\mathfrak{A} \times \mathfrak{B}\}$-measurable function and $P(\cdot, \cdot)$ be a stochastic (semistochastic) kernel on $\{\mathcal{X}, \mathfrak{B}\}$. Then the function

$$g(x) = \int_{\mathcal{Y}} f(x, y)\, P(x, dy)$$

is $\mathfrak{A}$-measurable.

Proof. For a fixed $x$ the function $f(x, \cdot)$ is $\mathfrak{B}$-measurable, so that the integral appearing in the r.h.s. of the equality is meaningful. The class of non-negative functions $f(x, y)$ for which the lemma is valid is a cone and, by virtue of Lebesgue's theorem, a monotone class. (A class $K$ of functions is called monotone if $0 \le f_1 \le f_2 \le \dots$, $f_n \in K$ imply $\lim f_n \in K$.) Since this class contains the indicators of the sets of the type $A \times B$, where $A \in \mathfrak{A}$ and $B \in \mathfrak{B}$, it contains all the non-negative $\sigma\{\mathfrak{A} \times \mathfrak{B}\}$-measurable functions as well. □

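The following assertion may be considered as a generalization of the well-known Fubini theorem.

Theorem 1. Let $\{\mathcal{X}, \mathfrak{A}\}$, $\{\mathcal{Y}, \mathfrak{B}\}$, $\{\mathcal{Z}, \mathfrak{C}\}$ be measurable spaces, and let $Q_1(x, B)$, $Q_2(y, C)$ be stochastic (or semistochastic) kernels on $\{\mathcal{X}, \mathfrak{B}\}$, $\{\mathcal{Y}, \mathfrak{C}\}$ respectively. There exists a unique stochastic (semistochastic) kernel $Q_3(x, D)$ on $\{\mathcal{X}, \sigma\{\mathfrak{B} \times \mathfrak{C}\}\}$ such that

$$Q_3(x, B \times C) = \int_B Q_1(x, dy)\, Q_2(y, C). \tag{7}$$

Moreover, for an arbitrary non-negative $\sigma\{\mathfrak{B} \times \mathfrak{C}\}$-measurable function $f(y, z)$ we have

$$\int_{\mathcal{Y} \times \mathcal{Z}} f(y, z)\, Q_3(x, dy \times dz) = \int_{\mathcal{Y}} \Bigl[\int_{\mathcal{Z}} f(y, z)\, Q_2(y, dz)\Bigr] Q_1(x, dy). \tag{8}$$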

To prove the first part of the theorem it is sufficient to show that formula (7) determines an elementary measure on the semi-ring of rectangles in the space $\mathcal{Y} \times \mathcal{Z}$. Let $D_1 = B_1 \times C_1$, $D_2 = B_2 \times C_2$ and $D_2 \subset D_1$. Then $B_2 \subset B_1$, $C_2 \subset C_1$ and $D_1 = D_2 \cup D' \cup D''$, where $D' = B_2 \times (C_1 \setminus C_2)$, $D'' = (B_1 \setminus B_2) \times C_1$.

The sets $D_2$, $D'$ and $D''$ are pairwise disjoint. If we apply formula (7) repeatedly to the sets $D_2$, $D'$ and $D''$, we obtain

$$Q_3(x, D_2) + Q_3(x, D') + Q_3(x, D'') = \int_{B_2} Q_1(x, dy)\, Q_2(y, C_2) + \int_{B_2} Q_1(x, dy)\, Q_2(y, C_1 \setminus C_2) + \int_{B_1 \setminus B_2} Q_1(x, dy)\, Q_2(y, C_1) = \int_{B_1} Q_1(x, dy)\, Q_2(y, C_1) = Q_3(x, D_1).$$

Thus the function $Q_3(x, D)$ is additive on these special subdivisions of the set $D_1$. In particular, if $D_3 = D_1 \cup D_2$, where $D_i$ are rectangles and $D_1 \cap D_2 = \emptyset$, then $Q_3(x, D_1) + Q_3(x, D_2) = Q_3(x, D_3)$ and $Q_3(x, \mathcal{Y} \times \mathcal{Z}) = 1$. (If $Q_1$ and $Q_2$ are semistochastic kernels, then $Q_3(x, \mathcal{Y} \times \mathcal{Z}) \le 1$.) The additivity of the function $Q_3(x, \cdot)$ on the semi-ring of all rectangles in the general case can easily be obtained by induction.

Let $D = \bigcup_{k=1}^{n} D_k$, where $D_k$ are pairwise disjoint rectangles. Then $D \setminus D_n = D' \cup D'' = \bigcup_{k=1}^{n-1} D_k$, where $D'$ and $D''$ are determined by the preceding formulas. As has already been shown,

$$Q_3(x, D) = Q_3(x, D_n) + Q_3(x, D') + Q_3(x, D'').$$

Using the induction assumption we obtain

$$Q_3(x, D') = Q_3\Bigl(x, \bigcup_{k=1}^{n-1} (D' \cap D_k)\Bigr) = \sum_{k=1}^{n-1} Q_3(x, D' \cap D_k)$$

and an analogous expression for $Q_3(x, D'')$. Therefore

$$Q_3(x, D) = Q_3(x, D_n) + \sum_{k=1}^{n-1} \bigl[Q_3(x, D' \cap D_k) + Q_3(x, D'' \cap D_k)\bigr].$$

Since $D'$ and $D''$ are disjoint rectangles jointly covering $D \setminus D_n$, it follows that $D' \cap D_k$ and $D'' \cap D_k$ are also rectangles and $(D' \cap D_k) \cup (D'' \cap D_k) = D_k$. Therefore $Q_3(x, D' \cap D_k) + Q_3(x, D'' \cap D_k) = Q_3(x, D_k)$, and hence

$$Q_3(x, D) = \sum_{k=1}^{n} Q_3(x, D_k).$$

We have thus proved the additivity of $Q_3(x, \cdot)$. We now verify the property of countable semi-additivity for $Q_3(x, \cdot)$. Let $D_0 \subset \bigcup_{k=1}^{\infty} D_k$, $D_k = B_k \times C_k$, $k = 0, 1, \dots$. Since

$$\chi_{D_0}(y, z) \le \sum_{k=1}^{\infty} \chi_{D_k}(y, z), \qquad \chi_{D_k}(y, z) = \chi_{B_k}(y)\, \chi_{C_k}(z),$$

it follows that

$$\chi_{B_0}(y)\, \chi_{C_0}(z) \le \sum_{k=1}^{\infty} \chi_{B_k}(y)\, \chi_{C_k}(z).$$

Integrating both sides of this inequality with respect to the measure $Q_2(y, \cdot)$ over the space $\mathcal{Z}$, we obtain

$$\chi_{B_0}(y)\, Q_2(y, C_0) \le \sum_{k=1}^{\infty} \chi_{B_k}(y)\, Q_2(y, C_k).$$

Integrating the relation obtained once more, this time with respect to the measure $Q_1(x, \cdot)$ over the space $\mathcal{Y}$, we arrive at the inequality

$$Q_3(x, D_0) \le \sum_{k=1}^{\infty} Q_3(x, D_k),$$

which shows that $Q_3(x, \cdot)$ is countably semiadditive. From here it follows that $Q_3(x, B \times C)$ admits a unique extension to $\sigma\{\mathfrak{B} \times \mathfrak{C}\}$. In order to prove formula (8) we first note that, in view of the preceding lemma, the inner integral on the right-hand side of (8) is a $\mathfrak{B}$-measurable function, so that the double integral in the r.h.s. of (8) is meaningful. Next, the class of functions $f(y, z)$ for which formula (8) is valid is a cone and a monotone class. Moreover, in view of formula (7) it contains the indicators of the rectangles. Therefore it contains all the $\sigma\{\mathfrak{B} \times \mathfrak{C}\}$-measurable non-negative functions. The theorem is thus proved. □

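In the same manner one can prove the following theorem.

Theorem 2. Let $\{\mathcal{X}, \mathfrak{A}\}$, $\{\mathcal{Y}_k, \mathfrak{B}_k\}$ ($k = 1, \dots, s$) be measurable spaces and $Q_1(x, B^{(1)})$, $Q_2(y_1, B^{(2)})$, ..., $Q_s(y_{s-1}, B^{(s)})$ be stochastic (semistochastic) kernels, $y_k \in \mathcal{Y}_k$, $B^{(k)} \in \mathfrak{B}_k$ ($k = 1, \dots, s$). There exists a unique stochastic (semistochastic) kernel $Q^{(1,s)}(x, D)$ on $\{\mathcal{X}, \mathfrak{D}\}$, where $\mathfrak{D} = \sigma\{\mathfrak{B}_1 \times \mathfrak{B}_2 \times \dots \times \mathfrak{B}_s\}$, such that

$$Q^{(1,s)}(x, B^{(1)} \times \dots \times B^{(s)}) = \int_{B^{(1)}} Q_1(x, dy_1) \int_{B^{(2)}} Q_2(y_1, dy_2) \dots \int_{B^{(s-1)}} Q_{s-1}(y_{s-2}, dy_{s-1})\, Q_s(y_{s-1}, B^{(s)}). \tag{9}$$

Moreover, for an arbitrary non-negative $\mathfrak{D}$-measurable function $f(y_1, \dots, y_s)$

$$\int_{\mathcal{Y}_1 \times \dots \times \mathcal{Y}_s} f(y_1, \dots, y_s)\, Q^{(1,s)}(x, dy_1 \times \dots \times dy_s) = \int_{\mathcal{Y}_1} Q_1(x, dy_1) \int_{\mathcal{Y}_2} Q_2(y_1, dy_2) \dots \int_{\mathcal{Y}_s} f(y_1, \dots, y_s)\, Q_s(y_{s-1}, dy_s). \tag{10}$$

Remark. Formulas (8) and (10) were proved for non-negative functions. They are obviously valid for arbitrary $f$ as well, provided only that one of the functions $f^+$ or $f^-$ is integrable. An analogous situation holds in other theorems where, for the sake of brevity, only non-negative functions are mentioned.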

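The kernel $Q^{(1,s)}$ is called the direct product of the kernels $Q_1, Q_2, \dots, Q_s$ and is denoted as $Q^{(1,s)} = Q_1 \times Q_2 \times \dots \times Q_s$.

If in (9) we put $B^{(1)} = \mathcal{Y}_1$, ..., $B^{(s-1)} = \mathcal{Y}_{s-1}$, then a new probability kernel on $\{\mathcal{X}, \mathfrak{B}_s\}$ is obtained:

$$Q^{*(1,s)}(x, B^{(s)}) = Q^{(1,s)}(x, \mathcal{Y}_1 \times \mathcal{Y}_2 \times \dots \times \mathcal{Y}_{s-1} \times B^{(s)}). \tag{11}$$

This kernel is called the convolution of the kernels $Q_1, \dots, Q_s$ and is denoted as

$$Q^{*(1,s)} = Q_1 * Q_2 * \dots * Q_s.$$

We now apply formula (10) to the function $f(y_1, y_2, \dots, y_s) = f(y_s) = \chi_{B^{(s)}}(y_s)$ and compare it with (11). We thus obtain:

$$\int_{\mathcal{Y}_1 \times \dots \times \mathcal{Y}_s} f(y_s)\, Q^{(1,s)}(x, dy_1 \times dy_2 \times \dots \times dy_s) = \int_{\mathcal{Y}_s} f(y_s)\, Q^{*(1,s)}(x, dy_s). \tag{12}$$

Since the class of non-negative functions for which formula (12) is valid is a cone and a monotone class, (12) is valid for an arbitrary non-negative $\mathfrak{B}_s$-measurable function. In turn, it follows from here that for an arbitrary non-negative $\sigma\{\mathfrak{B}_{m_1} \times \mathfrak{B}_{m_2} \times \dots \times \mathfrak{B}_{m_r} \times \mathfrak{B}_s\}$-measurable function of $r + 1$ variables of the form

$$f(y_{m_1}, y_{m_2}, \dots, y_{m_r}, y_s) \qquad (y_m \in \mathcal{Y}_m,\ 0 < m_1 < m_2 < \dots < m_r < s)$$

we have

$$\int_{\mathcal{Y}_1 \times \dots \times \mathcal{Y}_s} f(y_{m_1}, y_{m_2}, \dots, y_{m_r}, y_s)\, Q^{(1,s)}(x, dy_1 \times dy_2 \times \dots \times dy_s) = \int Q^{*(1,m_1)}(x, dy_{m_1}) \int Q^{*(m_1+1,m_2)}(y_{m_1}, dy_{m_2}) \dots \int f(y_{m_1}, \dots, y_{m_r}, y_s)\, Q^{*(m_r+1,s)}(y_{m_r}, dy_s). \tag{13}$$

A particular case of formula (13) is the relation

$$Q^{*(1,s)} = Q^{*(1,m_1)} * Q^{*(m_1+1,m_2)} * \dots * Q^{*(m_r+1,s)},$$

which shows that the convolution operation is associative.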
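On finite spaces a stochastic kernel is simply a row-stochastic matrix, and the two operations just defined become elementary. The following Python sketch is an illustration under that finite-space assumption; the matrices `Q1`, `Q2`, `Q3` and the helper names are hypothetical. It builds the direct product $Q_1 \times Q_2$ as a law on pairs, obtains the convolution $Q_1 * Q_2$ by summing out the intermediate point (a matrix product), and checks associativity in exact arithmetic.

```python
from fractions import Fraction as F


def direct_product(Q1, Q2, x):
    """Law of the pair (y, z) under Q1 x Q2 started at x: Q1[x][y] * Q2[y][z]."""
    return {(y, z): Q1[x][y] * Q2[y][z]
            for y in range(len(Q2)) for z in range(len(Q2[0]))}


def convolution(Q1, Q2):
    """Kernel Q1 * Q2: the intermediate point is integrated out."""
    return [[sum(Q1[x][y] * Q2[y][z] for y in range(len(Q2)))
             for z in range(len(Q2[0]))]
            for x in range(len(Q1))]


# Three hypothetical kernels on two-point spaces.
Q1 = [[F(1, 2), F(1, 2)], [F(1, 3), F(2, 3)]]
Q2 = [[F(1, 4), F(3, 4)], [F(1), F(0)]]
Q3 = [[F(0), F(1)], [F(1, 5), F(4, 5)]]

joint = direct_product(Q1, Q2, 0)
# Marginal law of the last coordinate, as in formula (11).
marginal = [sum(p for (y, z), p in joint.items() if z == c) for c in range(2)]

if __name__ == "__main__":
    print(marginal, convolution(Q1, Q2)[0])
```

In this setting formula (11) says that the last-coordinate marginal of the direct product is the corresponding row of the matrix product.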

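Consider infinite products of stochastic kernels. Let $\{\mathcal{X}_n, \mathfrak{B}_n\}$, $n = 0, 1, 2, \dots$, be an infinite sequence of measurable spaces and $P_n(\cdot, \cdot)$, $n = 1, 2, \dots$, be a sequence of stochastic kernels defined on $\{\mathcal{X}_{n-1}, \mathfrak{B}_n\}$. In accordance with Theorem 2 we construct the direct products of kernels

$$P^{(1,n)} = P_1 \times P_2 \times \dots \times P_n$$

on $\{\mathcal{X}_0, \mathfrak{C}_n\}$, where $\mathfrak{C}_n$ is the minimal $\sigma$-algebra containing the rectangles $B_1 \times B_2 \times \dots \times B_n$ ($B_k \in \mathfrak{B}_k$), $\mathfrak{C}_n = \sigma\{\mathfrak{B}_1 \times \mathfrak{B}_2 \times \dots \times \mathfrak{B}_n\}$.

We introduce the space $\mathcal{X}^{\infty} = \prod_{n=1}^{\infty} \mathcal{X}_n$, whose elements are the infinite sequences $\omega = (x_1, x_2, \dots, x_n, \dots)$. Denote by $\mathfrak{C}_0$ the algebra of cylindrical sets in $\mathcal{X}^{\infty}$ and define on $\mathfrak{C}_0$ a family of set functions $\mathsf{P}^{(x_0)}$ depending on the parameter $x_0$ ($x_0 \in \mathcal{X}_0$) as follows: if $C$ is a cylindrical set

$$C = \{\omega : (x_1, \dots, x_n) \in D\}, \qquad D \in \mathfrak{C}_n,$$

we put

$$\mathsf{P}^{(x_0)}(C) = P^{(1,n)}(x_0, D).$$

These set functions are uniquely defined. Indeed, if

$$C = \{\omega : (x_1, \dots, x_{n'}) \in D'\}, \qquad D' \in \mathfrak{C}_{n'},$$

and if, for example, $n' > n$, then $D' = D \times \mathcal{X}_{n+1} \times \dots \times \mathcal{X}_{n'}$ and

$$P^{(1,n')}(x_0, D') = \int_{\mathcal{X}_1} P_1(x_0, dx_1) \int_{\mathcal{X}_2} P_2(x_1, dx_2) \dots \int_{\mathcal{X}_{n'}} P_{n'}(x_{n'-1}, dx_{n'})\, \chi_{D'}(x_1, \dots, x_{n'}),$$

where $\chi_{D'}(x_1, x_2, \dots, x_{n'})$ is the indicator of $D'$. Noting that $\chi_{D'}(x_1, \dots, x_{n'}) = \chi_D(x_1, \dots, x_n)$ and that $P_k(x, \mathcal{X}_k) = 1$, it follows from the last expression that

$$P^{(1,n')}(x_0, D') = P^{(1,n)}(x_0, D).$$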

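The additivity of the function $\mathsf{P}^{(x_0)}$ on $\mathfrak{C}_0$ is obvious.

Theorem 3. There exists on $\{\mathcal{X}^{\infty}, \mathfrak{C}\}$, where $\mathfrak{C}$ is the $\sigma$-algebra generated by the cylinders of the space $\mathcal{X}^{\infty}$, a unique family of measures $\mathsf{P}^{(x_0)}$ such that

$$\mathsf{P}^{(x_0)}\{\omega : x_k \in B_k,\ k = 1, \dots, n\} = \int_{B_1} P_1(x_0, dx_1) \int_{B_2} P_2(x_1, dx_2) \dots \int_{B_{n-1}} P_{n-1}(x_{n-2}, dx_{n-1})\, P_n(x_{n-1}, B_n).$$

Proof. It is sufficient to show that the measure $\mathsf{P}^{(x_0)}$ introduced on $\mathfrak{C}_0$ satisfies the continuity condition: for any monotonically decreasing sequence of cylindrical sets $C_n$ such that $\bigcap_{n=1}^{\infty} C_n = \emptyset$ we have $\lim_{n \to \infty} \mathsf{P}^{(x_0)}(C_n) = 0$.

Assume the contrary: that $\mathsf{P}^{(x_0)}(C_n) > \varepsilon > 0$ for some $x_0$ and all $n$. Denote the bases of the cylindrical sets $C_n$ by $D_n$ and the indicator of $D_n$ by $\chi(D_n; x_1, x_2, \dots, x_{m_n})$, where $D_n$ is situated over the coordinates $(1, 2, \dots, m_n)$. Define the sequence of sets $B_n^{(1)}$ in $\mathfrak{B}_1$ by

$$B_n^{(1)} = \Bigl\{x_1 : \int_{\mathcal{X}(2, m_n)} \chi(D_n; x_1, x_2, \dots, x_{m_n})\, P^{(2,m_n)}(x_1, dx_2 \times \dots \times dx_{m_n}) > \frac{\varepsilon}{2}\Bigr\},$$

where $\mathcal{X}(2, m_n)$ denotes the product of the spaces $\mathcal{X}_2 \times \mathcal{X}_3 \times \dots \times \mathcal{X}_{m_n}$. Since the $C_n$ are decreasing, it follows that the $B_n^{(1)}$ are also monotonically decreasing. Moreover, if $\chi(B_n^{(1)})$ is the indicator of $B_n^{(1)}$ and $\bar{\chi}(B_n^{(1)}) = 1 - \chi(B_n^{(1)})$, then

$$\varepsilon < \mathsf{P}^{(x_0)}(C_n) = \int_{\mathcal{X}_1} \bigl[\chi(B_n^{(1)}) + \bar{\chi}(B_n^{(1)})\bigr] P_1(x_0, dx_1) \int_{\mathcal{X}(2, m_n)} \chi(D_n)\, P^{(2,m_n)}(x_1, dx_2 \times \dots \times dx_{m_n}) \le$$

$$\le P_1(x_0, B_n^{(1)}) + \frac{\varepsilon}{2} \int \bar{\chi}(B_n^{(1)})\, P_1(x_0, dx_1) \le P_1(x_0, B_n^{(1)}) + \frac{\varepsilon}{2}.$$

Therefore $P_1(x_0, B_n^{(1)}) > \frac{\varepsilon}{2}$. Since $P_1(x_0, \cdot)$ is a measure, it follows that $\bigcap_{n=1}^{\infty} B_n^{(1)} \ne \emptyset$. Let $\bar{x}_1 \in B_n^{(1)}$, $n = 1, 2, \dots$. Then

$$\int_{\mathcal{X}(2, m_n)} \chi(D_n; \bar{x}_1, x_2, \dots, x_{m_n})\, P^{(2,m_n)}(\bar{x}_1, dx_2 \times \dots \times dx_{m_n}) > \frac{\varepsilon}{2}.$$

The above arguments can be applied to the kernel $P^{(3,m_n)}(x_2, dx_3 \times \dots \times dx_{m_n})$ and the measure $P_2(\bar{x}_1, dx_2)$. This will prove the existence of a point $\bar{x}_2$ such that for any $D_n$

$$\int_{\mathcal{X}(3, m_n)} \chi(D_n; \bar{x}_1, \bar{x}_2, x_3, \dots, x_{m_n})\, P^{(3,m_n)}(\bar{x}_2, dx_3 \times \dots \times dx_{m_n}) > \frac{\varepsilon}{4}.$$

We therefore construct a sequence $(\bar{x}_1, \bar{x}_2, \dots)$, where $\bar{x}_k \in \mathcal{X}_k$, such that for arbitrary $s$ and $n$

$$\int_{\mathcal{X}(s+1, m_n)} \chi(D_n; \bar{x}_1, \bar{x}_2, \dots, \bar{x}_s, x_{s+1}, \dots, x_{m_n})\, P^{(s+1,m_n)}(\bar{x}_s, dx_{s+1} \times \dots \times dx_{m_n}) > \frac{\varepsilon}{2^s}.$$

Consider an arbitrary set $C_k$. Assume that its basis is situated over the coordinates $(1, 2, \dots, s)$. The last inequality shows that $(\bar{x}_1, \bar{x}_2, \dots, \bar{x}_s) \in D_k$ (otherwise we would have $\chi(D_k; \bar{x}_1, \dots, \bar{x}_s, x_{s+1}, \dots, x_{m_k}) = 0$ for all $(x_{s+1}, \dots, x_{m_k})$). Therefore $(\bar{x}_1, \bar{x}_2, \dots, \bar{x}_n, \dots) \in C_k$ for any $C_k$, and hence $\bigcap_{k=1}^{\infty} C_k \ne \emptyset$, which contradicts the initial assumption. The theorem is thus proved. □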

Corollary. Let a countable sequence of probability spaces $\{\mathcal{X}_n, \mathfrak{B}_n, q_n\}$, $n = 1, 2, \dots$, be given. Let $\mathcal{X}^{\infty}$ be the space of all sequences $\omega = (x_1, x_2, \dots, x_n, \dots)$, $x_n \in \mathcal{X}_n$, and let $\mathfrak{C}$ be the $\sigma$-algebra generated by the cylindrical sets in $\mathcal{X}^{\infty}$. There exists on $\{\mathcal{X}^{\infty}, \mathfrak{C}\}$ a unique probability measure $Q$ such that

$$Q\{\omega : x_k \in B_k,\ k = 1, 2, \dots, n\} = \prod_{k=1}^{n} q_k(B_k).$$

In other words, if a sequence of probability spaces $\{\mathcal{X}_n, \mathfrak{B}_n, q_n\}$, $n = 1, 2, \dots$, is given, then there always exists a probability space $\{\Omega, \mathfrak{S}, Q\}$ and a sequence of mappings $f_n$ of the space $\Omega$ into $\mathcal{X}_n$ such that the random elements $\xi_n = f_n(\omega)$ have the given distributions $q_n$ on $\mathfrak{B}_n$ and $\{\xi_n,\ n = 1, 2, \dots\}$ are jointly independent.

Remark. The theorem just proved, unlike Kolmogorov's theorem (Chapter I, Section 4, Theorem 2), does not require any topological assumptions on the nature of the spaces $\mathcal{X}_n$. On the other hand, it is less general than Kolmogorov's theorem since it applies only to a special construction of measures in the product space.

Definition of a Markov chain. Definition 2. A Markov chain with phase space $\{\mathcal{X}, \mathfrak{B}\}$ is a family of random processes with discrete time $t \in T_+$, depending on an arbitrary measure $m$ on $\{\mathcal{X}, \mathfrak{B}\}$ which serves as a parameter, with marginal distributions defined by the formula

$$\mathsf{P}^{(m)}\{\xi(k) \in B_k,\ k = 0, 1, \dots, n\} = \int_{B_0} m(dx) \int_{B_1} P_1(x, dy_1) \dots \int_{B_{n-1}} P_{n-1}(y_{n-2}, dy_{n-1})\, P_n(y_{n-1}, B_n), \tag{14}$$

where $\{P_t(x, B),\ t = 1, 2, \dots\}$ is a system of stochastic kernels on $\{\mathcal{X}, \mathfrak{B}\}$.

The stochastic kernels $P_t(x, B)$ are called one-step transition probabilities and the measure $m$ the initial distribution of the chain. Fixing the measure $m$, we obtain a random sequence with values in $\mathcal{X}$ which is called a Markov process corresponding to the initial distribution $m$.

The marginal distributions of this process are denoted by $\mathsf{P}^{(m)}_{0,1,\dots,n}$, and the operation of taking the mathematical expectation of a certain function of the process with respect to the probability measure $\mathsf{P}^{(m)}$ will be denoted by the symbol $\mathsf{E}_m$.

If the measure $m$ is concentrated at a fixed point $x$ of the phase space, we call $x$ the initial state of the process; the marginal distributions, the measure on $\{\mathcal{X}^{\infty}, \mathfrak{C}\}$ and the mathematical expectation of a function of the process with respect to the corresponding measure are then denoted by $\mathsf{P}^{(x)}_{0,1,\dots,n}$, $\mathsf{P}^{(x)}$ and $\mathsf{E}_x$ respectively.

Put

$$P(k, x, r, B) = \int_{\mathcal{X}} P_{k+1}(x, dy_{k+1}) \int_{\mathcal{X}} P_{k+2}(y_{k+1}, dy_{k+2}) \times \dots \times \int_{\mathcal{X}} P_{r-1}(y_{r-2}, dy_{r-1})\, P_r(y_{r-1}, B).$$

From an analytical point of view $P(k, \cdot, r, \cdot)$ is a stochastic kernel which is the convolution of the transition probabilities $P_{k+1} * P_{k+2} * \dots * P_r$. It is also called a transition probability. More precisely, $P(k, x, r, B)$ is the transition probability from the state $x$ during the time interval $(k, r)$ into the set $B$. From the associativity of the convolution of kernels the equality

$$P(k, x, s, B) = \int P(k, x, r, dy)\, P(r, y, s, B), \qquad k < r < s, \tag{15}$$

follows, which is the Chapman-Kolmogorov equation, and formula (13) yields

$$\mathsf{E}_x f(\xi(t_1), \xi(t_2), \dots, \xi(t_s)) = \int P(0, x, t_1, dy_1) \int P(t_1, y_1, t_2, dy_2) \times \dots \times \int f(y_1, y_2, \dots, y_s)\, P(t_{s-1}, y_{s-1}, t_s, dy_s). \tag{16}$$



We have thus obtained the same formulas as before (cf. (3) and (4)), but now they follow from the general definition of Markov chains. On the other hand, the arguments presented earlier show that the random sequence $\{\xi(t),\ t \in T_+\}$ obtained by means of the recursive relationship

$$\xi(t+1) = f(t, \xi(t), \alpha_t),$$

where $\xi(0), \alpha_0, \alpha_1, \dots$ are jointly independent variables and $\xi(0)$ has an arbitrary distribution $m$ on $\mathfrak{B}$, forms a Markov chain under minimal assumptions on the measurability of the function $f(t, \cdot, \cdot)$.

Formula (16) allows us to give a more precise probabilistic meaning to the notion of transition probabilities. For this purpose we compute the conditional mathematical expectation of a non-negative function $f(\xi(s), \xi(s+1), \dots, \xi(s+n))$ (here $f(y_0, \dots, y_n)$ is a Borel function of $n + 1$ variables) given the $\sigma$-algebra $\mathfrak{F}_{[0,t]}$ ($t \le s$) generated by the variables $\xi(0), \xi(1), \dots, \xi(t)$. The corresponding conditional mathematical expectation is denoted by $\hat{f}$. By definition, $\hat{f}$ is the unique $\mathfrak{F}_{[0,t]}$-measurable random variable such that for any non-negative function $g(x_0, x_1, \dots, x_t)$ the equality

$$\mathsf{E}_m\, g(\xi(0), \xi(1), \dots, \xi(t))\, f(\xi(s), \dots, \xi(s+n)) = \mathsf{E}_m\, g(\xi(0), \xi(1), \dots, \xi(t))\, \hat{f}$$

is fulfilled.

On the other hand, it follows from (16) that

$$\mathsf{E}_m\, g(\xi(0), \xi(1), \dots, \xi(t))\, f(\xi(s), \xi(s+1), \dots, \xi(s+n)) = \mathsf{E}_m\, g(\xi(0), \xi(1), \dots, \xi(t))\, \tilde{f},$$

where

$$\tilde{f} = \tilde{f}(\xi(t)) = \int P(t, \xi(t), s, dy_0) \int P_{s+1}(y_0, dy_1) \times \dots \times \int f(y_0, \dots, y_n)\, P_{s+n}(y_{n-1}, dy_n).$$

Thus $\hat{f} = \tilde{f}$.

The formula obtained leads to the following conclusions. 

Theorem 4. The conditional mathematical expectation of an arbitrary non-negative function $f(\xi(s), \xi(s+1), \dots, \xi(s+n))$ given $\mathfrak{F}_{[0,t]}$ ($t \le s$) does not depend on the initial distribution $m$, the transition probabilities preceding the instant of time $t$ and the values $\xi(0), \xi(1), \dots, \xi(t-1)$. It is given by the expression

$$\mathsf{E}_m\{f(\xi(s), \xi(s+1), \dots, \xi(s+n)) \mid \mathfrak{F}_{[0,t]}\} = \int P(t, \xi(t), s, dy_0) \int P_{s+1}(y_0, dy_1) \times \dots \times \int f(y_0, y_1, \dots, y_n)\, P_{s+n}(y_{n-1}, dy_n). \tag{17}$$

The conditional distribution of the variables $\xi(s), \xi(s+1), \dots, \xi(s+n)$ in $\mathcal{X}^{n+1}$ given $\mathfrak{F}_{[0,t]}$ coincides with the direct product of the kernels

$$P(t, \xi(t), s, \cdot) \times P_{s+1} \times \dots \times P_{s+n}.$$

In particular, the transition probability $P(t, \xi(t), s, B)$ coincides with the conditional probability of the system falling at time $s$ into the set $B$ if the states $\xi(0), \xi(1), \dots, \xi(t)$ are known. This probability depends only on the state $\xi(t)$ at the last known instant of time and does not depend on the values of $m$, $\xi(0), \xi(1), \dots, \xi(t-1)$ or on the transition probabilities $P_1(\cdot, \cdot), P_2(\cdot, \cdot), \dots, P_t(\cdot, \cdot)$. This last property of a Markov chain is called, as mentioned above, the absence of an after-effect, and it is the basic qualitative characteristic of a Markov chain.



Remark. Let a measurable space $\{\mathcal{X}, \mathfrak{B}\}$ and a system of stochastic kernels $P_n(x, B)$, $n = 1, 2, \dots$, defined on it be given. Then there exists a Markov chain for which $P_n(x, B)$ are the one-step transition probabilities. The proof of this assertion and the construction of the corresponding probability space are given in Theorem 3.

A Markov chain is called homogeneous if the one-step transition probabilities are independent of the time:

$$P_t(x, B) = P(x, B).$$

In this case the transition probabilities for the time interval $(t, s)$ depend only on the length of this interval:

$$P(t, x, s, B) = \int_{\mathcal{X}} P(x, dy_1) \int_{\mathcal{X}} P(y_1, dy_2) \times \dots \times \int_{\mathcal{X}} P(y_{s-t-2}, dy_{s-t-1})\, P(y_{s-t-1}, B) = P^{(s-t)}(x, B).$$

For a homogeneous chain the Chapman-Kolmogorov equation becomes

$$P^{(n+m)}(x, B) = \int P^{(n)}(x, dy)\, P^{(m)}(y, B).$$






Let a Markov chain be homogeneous. Formula (16) shows that

$$\mathsf{E}_m f(\xi(s), \xi(s+1), \dots, \xi(s+n)) = \int m_s(dy_0) \int P(y_0, dy_1) \dots \int f(y_0, y_1, \dots, y_n)\, P(y_{n-1}, dy_n), \tag{18}$$

where

$$m_s(B) = \int P(0, x, s, B)\, m(dx) = \int P^{(s)}(x, B)\, m(dx).$$

If quantity (18) does not depend on $s$ for any function $f(\cdot)$, then the homogeneous Markov process corresponding to the given initial distribution $m$ is called stationary. For stationarity of a process it is necessary and sufficient that the measure $m$ satisfy the condition

$$m(B) = \int P^{(s)}(x, B)\, m(dx), \qquad s = 1, 2, \dots. \tag{19}$$

This condition is equivalent to a (seemingly) simpler condition

$$m(B) = \int P(x, B)\, m(dx). \tag{20}$$

Indeed, (20) is a particular case of (19). If, however, (20) is satisfied, then

$$m(B) = \int P(x, B) \int P(y, dx)\, m(dy) = \int P^{(2)}(y, B)\, m(dy) = \dots = \int P^{(s)}(y, B)\, m(dy).$$


Probability measures m satisfying (19) are called invariant or, more 
explicitly, invariant measures corresponding to a given stochastic kernel. 

Therefore if for a given stochastic kernel there exists an invariant 
probability measure, then there exists an initial distribution for a homo- 
geneous Markov chain to which a stationary Markov process corre- 
sponds. The given kernel serves as the one-step transition probability 
for this process. If the invariant measure is unique then there exists a 
unique stationary process for the given chain. 
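For a chain with finitely many states, conditions (19) and (20) can be checked directly. The following Python sketch is an illustration only: the three-state matrix `P`, the candidate measure `m`, and the helper names are assumptions made for the example. It verifies (20) for the candidate and then confirms that iterating the one-step relation leaves the measure unchanged, which is condition (19) for the first few values of $s$.

```python
from fractions import Fraction as F

# A hypothetical one-step transition matrix on three states.
P = [[F(1, 2), F(1, 2), F(0)],
     [F(1, 4), F(1, 2), F(1, 4)],
     [F(0), F(1, 2), F(1, 2)]]


def push_forward(m, P):
    """Return the measure B -> sum_x m(x) P(x, B), evaluated on singletons."""
    return [sum(m[x] * P[x][y] for x in range(len(P)))
            for y in range(len(P[0]))]


def is_invariant(m, P):
    """Condition (20): m(B) = sum_x P(x, B) m(x)."""
    return push_forward(m, P) == m


m = [F(1, 4), F(1, 2), F(1, 4)]  # candidate invariant measure

if __name__ == "__main__":
    print(is_invariant(m, P))
    print(is_invariant([F(1), F(0), F(0)], P))
```

A measure concentrated at a single state is not invariant for this matrix, so the corresponding process is not stationary even though the chain is homogeneous.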

Let $\mathfrak{F}_t$ denote the minimal $\sigma$-algebra with respect to which the variables $\xi(0), \xi(1), \dots, \xi(t)$ ($t = 0, 1, 2, \dots$) are measurable, let $\tau$ be a random time on $\{\mathfrak{F}_t,\ t = 0, 1, \dots\}$ and $\Omega_\tau$ the domain of definition of $\tau$. Consider the following problem. Let $\xi(t)$ be a homogeneous Markov chain. How does the process $\xi_\tau(t) = \xi(t + \tau)$ behave on $\Omega_\tau$? It is natural to expect that under the hypothesis $\xi(\tau) = x$ the random process $\xi_\tau(t)$ behaves exactly in the same manner as the Markov process $\xi(t)$ under the hypothesis $\xi(0) = x$. We now state this assertion rigorously and prove it. (This property is called the strong Markov property.)

Clearly, $\xi(t + \tau)$ is defined on $\Omega_\tau$, and it follows from Lemma 5 of Section 1, Chapter I that $\xi(t + \tau)$ ($t \ge 0$) is $\mathfrak{S}$-measurable. Put $P^{(\tau)}(x, A) = \mathsf{P}^{(x)}\{\Omega_\tau \cap (\xi(\tau) \in A)\}$. Then $P^{(\tau)}(x, A)$ is a semistochastic kernel on $\{\mathcal{X}, \mathfrak{B}\}$. Indeed,

$$P^{(\tau)}(x, A) = \sum_{s=1}^{\infty} \mathsf{P}^{(x)}\{[\tau = s] \cap [\xi(s) \in A]\}.$$

From here it follows immediately that $P^{(\tau)}(x, \cdot)$ is a measure on $\mathfrak{B}$ and $P^{(\tau)}(x, \mathcal{X}) \le 1$. On the other hand, there exists a set $B^{(s)}$ such that the event $\{\tau = s\}$ is equivalent to the event $\{(\xi(0), \xi(1), \dots, \xi(s)) \in B^{(s)}\}$. Therefore

$$\mathsf{P}^{(x)}\{[\tau = s] \cap [\xi(s) \in A]\} = \mathsf{P}^{(x)}\{(\xi(0), \xi(1), \dots, \xi(s)) \in B^{(s)} \cap A^{(s)}\},$$

where $A^{(s)} = \mathcal{X} \times \dots \times \mathcal{X} \times A$ (every factor except the last equals $\mathcal{X}$), and it follows from the properties of semistochastic kernels that this probability, as well as $P^{(\tau)}(x, A)$, are $\mathfrak{B}$-measurable functions of $x$.

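Denote by $\mathfrak{F}_\tau$ the $\sigma$-algebra induced by the random time $\tau$.

Theorem 5. If $D \in \mathfrak{F}_\tau$ and $D \subset \Omega_\tau$, then

$$\mathsf{P}^{(x)}\Bigl\{D \cap \bigcap_{k=1}^{n} [\xi(t_k + \tau) \in A_k]\Bigr\} = \int \mathsf{P}^{(y)}\Bigl\{\bigcap_{k=1}^{n} [\xi(t_k) \in A_k]\Bigr\}\, P^{(\tau)}(x, D, dy), \tag{21}$$

where

$$P^{(\tau)}(x, D, A) = \mathsf{P}^{(x)}\{D \cap [\xi(\tau) \in A]\}.$$

Proof. Since $D \subset \Omega_\tau$, it follows that

$$\mathsf{P}^{(x)}\Bigl\{D \cap \bigcap_{k=1}^{n} [\xi(t_k + \tau) \in A_k]\Bigr\} = \sum_{s=1}^{\infty} \mathsf{P}^{(x)}\Bigl\{D_s \cap \bigcap_{k=1}^{n} [\xi(t_k + s) \in A_k]\Bigr\},$$

where $D_s = D \cap [\tau = s]$. Let $\chi(D_s)$ be the indicator of the event $D_s$. Taking into account the properties of conditional probabilities of a Markov chain (Theorem 4), we have

$$\mathsf{P}^{(x)}\Bigl\{D_s \cap \bigcap_{k=1}^{n} [\xi(t_k + s) \in A_k]\Bigr\} = \mathsf{E}_x\Bigl\{\chi(D_s)\, \mathsf{P}^{(x)}\Bigl(\bigcap_{k=1}^{n} [\xi(t_k + s) \in A_k] \Bigm| \mathfrak{F}_s\Bigr)\Bigr\}.$$

In view of the fact that the chain is homogeneous, the r.h.s. of the last equality is equal to

$$\int_{D_s} \mathsf{P}^{(\xi(s))}\Bigl\{\bigcap_{k=1}^{n} [\xi(t_k) \in A_k]\Bigr\}\, d\mathsf{P}^{(x)} = \int_{\mathcal{X}} \mathsf{P}^{(y)}\Bigl\{\bigcap_{k=1}^{n} [\xi(t_k) \in A_k]\Bigr\}\, P(s, x, D, dy), \tag{22}$$

where $P(s, x, D, \cdot)$ is a measure defined on $\{\mathcal{X}, \mathfrak{B}\}$ by the relation $P(s, x, D, A) = \mathsf{P}^{(x)}\{D \cap [\tau = s] \cap [\xi(s) \in A]\}$.

If we now introduce the measure

$$P^{(\tau)}(x, D, A) = \mathsf{P}^{(x)}\{D \cap [\xi(\tau) \in A]\} = \sum_{s=1}^{\infty} P(s, x, D, A)$$

and sum equation (22) with respect to $s$, we obtain the required assertion. □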



§ 5. Markov Chains with a Countable Number of States 

Reducibility and irreducibility. Let $X$ be a finite or a countable set. In this case the set of all subsets of $X$ will always be considered as the $\sigma$-algebra of the measurable sets of $X$. Here arbitrary functions on $X$ are measurable.

Points of the space $X$ will be denoted by the letters $i, j, \dots$. Consider a homogeneous Markov chain with values in $X$. It is defined by the one-step transition probabilities $p(i, j)$, $i, j \in X$, into singletons $\{j\}$. The one-step transition probability into an arbitrary set $B$ is expressed in terms of $p(i, j)$ by the formula

$$P(i, B) = \sum_{j \in B} p(i, j),$$

and integration with respect to the measure corresponding to the stochastic kernel $P(i, B)$ becomes summation:

$$\int_X f(j)\, P(i, dj) = \sum_{j \in X} p(i, j)\, f(j).$$

The expression for the $n$-step transition probabilities into singletons $\{j\}$ becomes

$$p^{(n)}(i, j) = \sum_{j_1, j_2, \dots, j_{n-1} \in X} p(i, j_1)\, p(j_1, j_2) \dots p(j_{n-1}, j). \tag{1}$$

Introducing the matrix $P^{(n)}$ (with a finite or infinite number of rows) whose elements are the $n$-step transition probabilities, $P^{(n)} = \{p^{(n)}(i, j)\}$, we obtain from formula (1) that

$$P^{(n)} = P^n,$$

where $P^n$ is the $n$-th power of the matrix $P = P^{(1)}$, which is the matrix of the one-step transition probabilities. The matrix $P = \{p(i, j)\}$ has the properties

$$\text{a)}\ p(i, j) \ge 0, \qquad \text{b)}\ \sum_{j \in X} p(i, j) = 1. \tag{2}$$

A matrix $P$ possessing properties a) and b) is called a stochastic matrix. It follows from the equality $P^{n+m} = P^n P^m$ that

$$p^{(n+m)}(i, j) = \sum_{k \in X} p^{(n)}(i, k)\, p^{(m)}(k, j). \tag{3}$$

On the other hand, formula (3) is the Chapman-Kolmogorov equation 
((15), Section 4) for this particular case. 
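The agreement between the path sum in formula (1) and the matrix power can be checked mechanically on a small example. The Python sketch below is an illustration only; the three-state matrix `P` and the helper names are assumptions made for the example. It evaluates (1) by enumerating all intermediate states and compares the result with $P^n$, then checks equation (3).

```python
from fractions import Fraction as F
from itertools import product

# A hypothetical stochastic matrix on the states 0, 1, 2.
P = [[F(1, 2), F(1, 2), F(0)],
     [F(1, 4), F(1, 2), F(1, 4)],
     [F(0), F(1, 2), F(1, 2)]]


def n_step_by_paths(P, i, j, n):
    """Formula (1): sum over all intermediate states j_1, ..., j_{n-1}."""
    total = F(0)
    for path in product(range(len(P)), repeat=n - 1):
        full = (i, *path, j)
        weight = F(1)
        for a, b in zip(full, full[1:]):
            weight *= P[a][b]
        total += weight
    return total


def mat_mul(A, B):
    """Ordinary matrix product of two square matrices."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def mat_pow(P, n):
    """n-th power of P, starting from the identity matrix."""
    size = len(P)
    result = [[F(int(i == j)) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, P)
    return result


if __name__ == "__main__":
    print(n_step_by_paths(P, 0, 2, 3), mat_pow(P, 3)[0][2])
```

Path enumeration grows exponentially in $n$, so it serves here only as a check on the matrix formulation.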

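Definition 1. The state $j \in X$ is accessible from the state $i$ if the transition probability from $i$ to $j$ in some number of steps is positive. If $j$ is accessible from $i$ and $i$ is accessible from $j$, then the states $i$ and $j$ are called communicative. By definition the state $i$ always communicates with $i$.

The fact that $i$ and $j$ are communicative states is denoted symbolically by $i \leftrightarrow j$. If $j$ is accessible from $i$ and $k$ from $j$, then $k$ is accessible from $i$. This follows from the inequality $p^{(n+m)}(i, k) \ge p^{(n)}(i, j)\, p^{(m)}(j, k)$. The relation $\leftrightarrow$ is an equivalence relation:

a) $i \leftrightarrow i$;

b) if $i \leftrightarrow j$, then $j \leftrightarrow i$;

c) if $i \leftrightarrow j$ and $j \leftrightarrow k$, then $i \leftrightarrow k$.

Indeed, a) follows from the fact that $p^{(0)}(i, i) = 1$, b) follows from the symmetry of $i$ and $j$ in the definition of communicative states, and finally c) follows from the fact that

$$p^{(n+m)}(i, k) \ge p^{(n)}(i, j)\, p^{(m)}(j, k) > 0, \qquad p^{(n'+m')}(k, i) \ge p^{(m')}(k, j)\, p^{(n')}(j, i) > 0,$$

if $p^{(n)}(i, j) > 0$, $p^{(m)}(j, k) > 0$, $p^{(m')}(k, j) > 0$, $p^{(n')}(j, i) > 0$.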










An arbitrary Markov chain can be decomposed into disjoint classes of communicative states. This decomposition may be carried out as follows. Choose an arbitrary state $i_1$ and denote by $X_{i_1}$ the totality of all states which communicate with $i_1$. It follows from property c) of the relation $\leftrightarrow$ that any pair of states in $X_{i_1}$ are communicative. If $X_{i_1}$ does not exhaust $X$, we choose a state $i_2 \notin X_{i_1}$ and construct the class $X_{i_2}$ analogously. Since $i_1$ and $i_2$ do not communicate, the classes $X_{i_1}$ and $X_{i_2}$ are disjoint. We continue the construction of the sets $X_{i_k}$ until the whole space $X$ is exhausted. The classes $X_{i_k}$ so constructed possess the following properties:

1) the number of classes $X_{i_k}$ is at most countable;

2) each element of $X$ occurs in one and only one class $X_{i_k}$;

3) each pair of states in $X_{i_k}$ are communicative;

4) any pair of states belonging to different classes do not communicate.

The last two properties can also be stated as follows: given an arbitrary state $i$ of a given class $X_{i_k}$, it is possible to reach with positive probability, in a certain number of steps, any other state of the same class. It may also occur that a system in a given class leaves it eventually, but then the probability that, having left the class, it will ever return is zero.

Definition 2. A Markov chain is called irreducible if it consists of only one class of communicative states. If any state $j$ accessible from $i$ communicates with $i$, then the state $i$ is called essential. Otherwise it is called unessential*.

It is easy to observe that only essential states are accessible from an essential state. Indeed, let $i$ be essential and $j$ be accessible from $i$. If $k$ is accessible from $j$, then $k$ is accessible from $i$, and since $i$ is essential, $i$ is accessible from $k$. But then $j$ also is accessible from $k$, i.e. $j$ is essential.

The following corollary is thus valid: in a class of communicative states all the states are either essential or unessential.
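For a finite state space the decomposition into classes and the test for essential states can be carried out mechanically. The following Python sketch is an illustration only: the function names and the three-state example matrix are assumptions made for the example. It computes accessibility by a transitive closure, counting each state as accessible from itself in line with the convention $p^{(0)}(i, i) = 1$, and then reads off the communicating classes and the essential states.

```python
def accessibility(P):
    """reach[i][j] is True when j is accessible from i (zero steps allowed)."""
    n = len(P)
    reach = [[i == j or P[i][j] > 0 for j in range(n)] for i in range(n)]
    # Warshall-style transitive closure over intermediate states k.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = True
    return reach


def communicating_classes(P):
    """Partition the states into classes of mutually accessible states."""
    reach = accessibility(P)
    classes, seen = [], set()
    for i in range(len(P)):
        if i not in seen:
            cls = [j for j in range(len(P)) if reach[i][j] and reach[j][i]]
            classes.append(cls)
            seen.update(cls)
    return classes


def essential_states(P):
    """States i such that every j accessible from i leads back to i."""
    reach = accessibility(P)
    n = len(P)
    return [i for i in range(n)
            if all(reach[j][i] for j in range(n) if reach[i][j])]


# Hypothetical chain: state 0 can leak into {1, 2} but never returns.
P = [[0.5, 0.5, 0.0],
     [0.0, 0.5, 0.5],
     [0.0, 0.5, 0.5]]

if __name__ == "__main__":
    print(communicating_classes(P))
    print(essential_states(P))
```

Only the positivity of the entries matters here, so floating-point entries are adequate for this check.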

Recurrency. Let $\xi(n)$ be the state of a Markov system at the instant of time $n$. Denote by $\tau_j = \tau_j(n)$ the number of steps required by the system, starting from time $n$, to reach the state $j$ for the first time. Thus $\tau_j(n)$ is determined by the string of relations

$$\xi(n+1) \ne j,\ \dots,\ \xi(n + \tau_j - 1) \ne j,\ \xi(n + \tau_j) = j.$$

We introduce a family of $\sigma$-algebras $\{\mathfrak{F}_{[n, n+t]},\ t = 0, 1, \dots\}$, where $\mathfrak{F}_{[n, n+t]}$ is the minimal $\sigma$-algebra with respect to which the functions $\xi(n), \xi(n+1), \dots, \xi(n+t)$ are measurable.



* transient in Feller's terminology. Translator's Remark.

The variable $\tau_j(n)$ is a random time for this family. Put

$$f^{(s)}(i, j) = \mathsf{P}\{\tau_j(n) = s \mid \xi(n) = i\}, \quad s = 1, 2, \dots, \qquad f^{(0)}(i, j) = 0.$$

Moreover 

Since the chain is homogeneous it follows that the probabilities 
do not depend on n. For these probabilities are called the 
probabilities of first passage through state j and for i = j the first recurrence 
probabilities into state i. 

The sum

$$F(i,j) = \sum_{s=1}^{\infty} f^{(s)}(i,j)$$

is the probability that the system leaving the state i will eventually reach
the state j. Analogously, $F(i,i)$ is the probability that the system leaving
the state i will return to this state in a finite number of steps. For $F(i,j) < 1$
the r.v. $\tau_j$ is improper.

Definition 3. The state i is called recurrent* if $F(i,i) = 1$, and is called
non-recurrent if $F(i,i) < 1$.

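It is easy to establish the connection between the transition probabilities
and the probabilities of first passage. It is given by the relationship

$$p^{(n)}(i,j) = \sum_{s=1}^{n} f^{(s)}(i,j)\, p^{(n-s)}(j,j), \tag{4}$$

where $p^{(0)}(i,j) = \delta_{ij}$. Indeed, let $\tau_j$ be the (waiting) time up to the first
passage through j starting from a certain initial moment. Then

$$p^{(n)}(i,j) = P^{(i)}\Big\{\bigcup_{s=1}^{n} [\tau_j = s] \cap [\xi(n) = j]\Big\} = \sum_{s=1}^{n} P^{(i)}\{[\tau_j = s] \cap [\xi(n) = j]\} =$$

$$= \sum_{s=1}^{n} P^{(i)}\{\tau_j = s\}\, P^{(i)}\{\xi(n) = j \mid \tau_j = s\} = \sum_{s=1}^{n} f^{(s)}(i,j)\, p^{(n-s)}(j,j).$$

Formula (4) is thus proved. We note its particular case:

$$p^{(n)}(i,i) = \sum_{s=1}^{n} f^{(s)}(i,i)\, p^{(n-s)}(i,i), \tag{5}$$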

* persistent in Feller's terminology. Translator's Remark.

which can also be rewritten in the form

$$f^{(n)}(i,i) = p^{(n)}(i,i) - \sum_{s=1}^{n-1} f^{(s)}(i,i)\, p^{(n-s)}(i,i).$$

This last relationship allows us to calculate recursively the recurrence 
probabilities given the transition probabilities. Note that in order to 
evaluate the recurrence probability into state i it is sufficient to know the 
transition probabilities into that state only. 
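
The recursion can be sketched in a few lines of Python. This is only an illustration: the two-state matrix `P` and the helper names `mat_mul` and `return_probabilities` are assumptions introduced here, not part of the text. The function first tabulates $p^{(n)}(i,i)$ by matrix powers and then applies the rewritten form of (5).

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def return_probabilities(P, i, n_max):
    """Return (p, f) with p[n] = p^(n)(i,i) and f[n] = f^(n)(i,i), n = 0..n_max."""
    size = len(P)
    power = [[1.0 if r == c else 0.0 for c in range(size)] for r in range(size)]
    p = [1.0]                       # p^(0)(i,i) = 1
    for _ in range(n_max):
        power = mat_mul(power, P)
        p.append(power[i][i])
    f = [0.0] * (n_max + 1)         # f^(0)(i,i) = 0
    for n in range(1, n_max + 1):
        # f^(n)(i,i) = p^(n)(i,i) - sum_{s=1}^{n-1} f^(s)(i,i) p^(n-s)(i,i)
        f[n] = p[n] - sum(f[s] * p[n - s] for s in range(1, n))
    return p, f


# Hypothetical two-state chain used only for illustration.
P = [[0.5, 0.5],
     [0.25, 0.75]]
p, f = return_probabilities(P, 0, 60)
```

Note that only the diagonal entries $p^{(n)}(i,i)$ enter the recursion, in line with the remark above. For this chain the partial sums of `f` approach 1, which is the numerical counterpart of $F(0,0) = 1$.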

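We introduce the generating functions $P_{ij}(z)$, $F_{ij}(z)$ of the sequences
$\{p^{(n)}(i,j),\ n = 0, 1, 2, \ldots\}$ and $\{f^{(n)}(i,j),\ n = 0, 1, 2, \ldots\}$:

$$P_{ij}(z) = \sum_{n=0}^{\infty} p^{(n)}(i,j)\, z^n, \qquad F_{ij}(z) = \sum_{n=0}^{\infty} f^{(n)}(i,j)\, z^n.$$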

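It follows from formula (5) that

$$P_{ii}(z) = p^{(0)}(i,i) + \sum_{n=1}^{\infty} \sum_{k=1}^{n} f^{(k)}(i,i)\, p^{(n-k)}(i,i)\, z^n =$$

$$= 1 + \sum_{k=1}^{\infty} \sum_{n=k}^{\infty} f^{(k)}(i,i)\, z^k\, p^{(n-k)}(i,i)\, z^{n-k} =$$

$$= 1 + \sum_{k=1}^{\infty} f^{(k)}(i,i)\, z^k\, P_{ii}(z),$$

or that

$$P_{ii}(z) = 1 + F_{ii}(z)\, P_{ii}(z).$$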

One can interchange the order of summation in the
above calculations since the series are absolutely convergent for $|z| < 1$.
The last formula can also be written in the form

$$P_{ii}(z) = \frac{1}{1 - F_{ii}(z)}. \tag{6}$$



The following equality can be derived from (4) analogously:

$$P_{ij}(z) = P_{jj}(z)\, F_{ij}(z), \quad i \neq j. \tag{7}$$

Let z be a real number and $z \uparrow 1$. The functions $P_{ii}(z)$ and $F_{ii}(z)$ are
monotonically increasing functions and, in view of Abel's theorem, the
limit $\lim_{z \uparrow 1} F_{ii}(z)$ exists and moreover $\lim_{z \uparrow 1} F_{ii}(z) = F_{ii}(1) = F(i,i)$. Set

$$\lim_{z \uparrow 1} P_{ii}(z) = G(i,i) = P_{ii}(1).$$



Using (6) we obtain the following:

Theorem 1. The state i is recurrent if $G(i,i) = \sum_{n=0}^{\infty} p^{(n)}(i,i) = \infty$ and is
non-recurrent if $G(i,i) = \sum_{n=0}^{\infty} p^{(n)}(i,i) < \infty$. In the non-recurrent case

$$G(i,i) = \frac{1}{1 - F(i,i)}.$$
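
As a numerical illustration of Theorem 1, consider a simple random walk on the integers that steps +1 with probability p and -1 with probability 1 - p. This example is not taken from the text; it is a standard one, for which $p^{(2n)}(0,0) = \binom{2n}{n} (p(1-p))^n$ and odd-step return probabilities vanish. The sketch below (the function name `green_function_at_origin` is an assumption) sums the series for $G(0,0)$ and then recovers $F(0,0)$ from the formula of Theorem 1.

```python
def green_function_at_origin(p, n_terms=2000):
    """Partial sum of G(0,0) = sum_n p^(2n)(0,0) for the walk with up-probability p."""
    q = 1.0 - p
    term = 1.0                      # p^(0)(0,0) = 1
    total = term
    for n in range(n_terms):
        # ratio of consecutive central binomial coefficients:
        # C(2n+2, n+1) / C(2n, n) = (2n+1)(2n+2) / (n+1)^2
        term *= (2 * n + 1) * (2 * n + 2) / ((n + 1) ** 2) * p * q
        total += term
    return total


p = 0.7
G = green_function_at_origin(p)     # finite, so the origin is non-recurrent
F = 1.0 - 1.0 / G                   # Theorem 1 solved for F(0,0)
```

For p = 0.7 the series converges to 1/|2p - 1| = 2.5, so the return probability is F(0,0) = 0.6 < 1. For p = 0.5 the same series diverges, which corresponds to the recurrent case.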



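Theorem 2. If the states i and j are communicative then they are either
both recurrent or both non-recurrent.

Proof. Since $i \leftrightarrow j$, one can find $m_1$ and $m_2$ such that $p^{(m_1)}(i,j) > 0$ and
$p^{(m_2)}(j,i) > 0$. Since

$$p^{(m_1+n+m_2)}(j,j) \ge p^{(m_2)}(j,i)\, p^{(n)}(i,i)\, p^{(m_1)}(i,j),$$

then

$$\sum_{n=m_1+m_2}^{\infty} p^{(n)}(j,j) \ge p^{(m_2)}(j,i)\, p^{(m_1)}(i,j) \sum_{n=0}^{\infty} p^{(n)}(i,i).$$

Hence the series $G(j,j)$ is divergent if the series $G(i,i)$ is divergent.
Exchanging the roles of i and j we obtain that either both $G(i,i)$ and
$G(j,j)$ are finite or they are both infinite. □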

Thus the recurrence property for a Markov chain is not so much a
property of an individual state as a property of the class of communicating
states.

Intuitive considerations indicate that, over an infinite time interval,
returns to a recurrent state should take place infinitely often, while
returns to a non-recurrent state should take place only finitely often.
These assertions can easily be proved.

Let $Q_j(m)$ be the event that the system reaches the j-th state at least m
times and let $\tau_j$ be the number of steps until the first passage through the
state j. Then

$$Q_j(m) = \bigcup_{n=1}^{\infty} Q_j(m) \cap \{\tau_j = n\}.$$

Let $q_{ij}(m)$ be the probability of the event $Q_j(m)$ given $\xi(0) = i$. We have

$$q_{ij}(m) = \sum_{n=1}^{\infty} P(Q_j(m) \cap [\tau_j = n] \mid \xi(0) = i) =$$

$$= \sum_{n=1}^{\infty} P^{(i)}(Q_j(m) \mid \tau_j = n)\, P^{(i)}(\tau_j = n \mid \xi(0) = i) =$$

$$= \sum_{n=1}^{\infty} P^{(i)}(Q_j(m) \mid \tau_j = n)\, f^{(n)}(i,j).$$

It is easy to verify that

$$P^{(i)}(Q_j(m) \mid \tau_j = n) = P^{(j)}(Q_j(m-1)) = q_{jj}(m-1).$$

Thus

$$q_{ij}(m) = F(i,j)\, q_{jj}(m-1). \tag{8}$$

Let $q_{ij} = q_{ij}(\infty)$ be the probability that the system after leaving the i-th
state reaches the j-th state infinitely often. Since $q_{ij} = \lim_{m \to \infty} q_{ij}(m)$, it
follows from (8) that

$$q_{ij} = F(i,j)\, q_{jj}. \tag{9}$$

Theorem 3. If j is a recurrent state, then $q_{ij} = F(i,j)$ and in particular
$q_{jj} = 1$; if, however, j is a non-recurrent state, then $q_{ij} = 0$ for every i.

Proof. If $F(j,j) < 1$, we obtain that $q_{jj} = 0$ by putting i = j in (9). Also from
(9) we have $q_{ij} = 0$. If $F(j,j) = 1$ it follows from (8) that $q_{jj}(m) = [F(j,j)]^{m-1} = 1$
and hence $q_{jj} = 1$. It then follows from (9) that $q_{ij} = F(i,j)$. □

Let $F(i,j) = 1$. In view of the strong Markov property (cf. Section 4,
Theorem 5) we obtain

$$P^{(i)}(B \cap \{\xi(\tau_j + h) = k\}) = P^{(i)}(B)\, P^{(j)}(\{\xi(h) = k\})$$

for any $B \in \mathfrak{F}_{\tau_j}$. This relationship yields

Theorem 4. If $F(i,j) = 1$, then the random process $\xi'(t) = \xi(\tau_j + t)$ $(\xi(0) = i)$
is stochastically equivalent to the process $\xi(t)$ with the initial state $\xi(0) = j$
and does not depend on the $\sigma$-algebra $\mathfrak{F}_{\tau_j}$.

Corollary. Let $\xi(0) = i$, where i is a recurrent state, let $\zeta_1$ be the number of
steps up to the first return to i, $\zeta_2$ the number of steps between the first
and the second return to i, and so on.

The random variables $\zeta_1, \zeta_2, \ldots, \zeta_n, \ldots$ are identically distributed and
independent.

Theorem 5. If the state i is recurrent and $F(i,j) > 0$, then the system after
leaving i will reach the state j infinitely many times ($q_{ij} = 1$) and $F(j,i) > 0$.
In particular in this case $F(i,j) = 1$.

It follows from Theorem 3 that there are infinitely many returns to
the state i. Let $C_k$ denote the event that the system reaches state j
between its (k-1)-th and k-th return to state i. In view of the strong
Markov property of the process the events $C_k$ are independent and have
the same probability. Since $P\{\bigcup_{k=1}^{\infty} C_k\}$ is the probability that the system
eventually reaches state j, it follows that $P(C_k) > 0$ and $\sum_{k=1}^{\infty} P(C_k) = \infty$.

From the Borel-Cantelli lemma we obtain that with probability 1
infinitely many of the events $C_k$ will occur. Moreover, if the system reaches
state j it will then reach state i infinitely many times.



Corollary 1. Only recurrent states can be reached from a recurrent state. 
Recurrent states are essential. 

This corollary sharpens Theorem 2 which was obtained previously 
using the method of generating functions. 



Corollary 2. In a class of communicative states containing a recurrent
state, all other states are also recurrent, and a system starting in this class
will, with probability one, visit every other state of the class and, moreover,
will do so infinitely often.



A class of recurrent communicative states is called a recurrent class.
Now set

$$G(i,j) = \sum_{n=0}^{\infty} p^{(n)}(i,j).$$

The meaning of this series was explained for the case i = j. We now
establish the following relation:

$$\lim_{N \to \infty} \frac{\sum_{n=1}^{N} p^{(n)}(i,j)}{\sum_{n=0}^{N} p^{(n)}(j,j)} = F(i,j). \tag{10}$$



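The proof is based on formula (4). Setting n = 1, 2, ..., N in (4) and
summing up the equalities obtained we have

$$\sum_{n=1}^{N} p^{(n)}(i,j) = \sum_{n=1}^{N} \sum_{s=0}^{n-1} f^{(n-s)}(i,j)\, p^{(s)}(j,j) = \sum_{s=0}^{N-1} \sum_{n=s+1}^{N} f^{(n-s)}(i,j)\, p^{(s)}(j,j) = \sum_{s=0}^{N} p^{(s)}(j,j)\, F_{N-s},$$

where $F_{N-s} = \sum_{k=1}^{N-s} f^{(k)}(i,j)$ (so that $F_0 = 0$) and $F_N \to F(i,j)$ for $N \to \infty$. Therefore

$$\frac{\sum_{n=1}^{N} p^{(n)}(i,j)}{\sum_{n=0}^{N} p^{(n)}(j,j)} = \frac{\sum_{s=0}^{N} p^{(s)}(j,j)\, F_{N-s}}{\sum_{n=0}^{N} p^{(n)}(j,j)}.$$

The validity of formula (10) now follows from the following lemma.

Lemma 1. If $\{b_n,\ n = 0, 1, \ldots\}$ is a sequence of non-negative numbers and

$$\frac{b_N}{\sum_{s=0}^{N} b_s} \to 0 \quad (N \to \infty),$$

then we have for an arbitrary convergent sequence $\{c_n,\ n = 0, 1, 2, \ldots\}$:

$$\lim_{N \to \infty} \frac{\sum_{k=0}^{N} b_k c_{N-k}}{\sum_{k=0}^{N} b_k} = \lim_{n \to \infty} c_n.$$

Proof. If $c = \lim_{n \to \infty} c_n$, then

$$\frac{\sum_{k=0}^{N} b_k c_{N-k}}{\sum_{k=0}^{N} b_k} - c = \frac{\sum_{k=0}^{N-n} b_k (c_{N-k} - c)}{\sum_{k=0}^{N} b_k} + \frac{\sum_{k=N-n+1}^{N} b_k c_{N-k}}{\sum_{k=0}^{N} b_k} - c\, \frac{\sum_{k=N-n+1}^{N} b_k}{\sum_{k=0}^{N} b_k}. \tag{11}$$

If the index n is chosen so that $|c - c_{n'}| < \varepsilon$ for $n' \ge n$, where $\varepsilon > 0$ is arbitrary,
then the absolute value of the first term in the r.h.s. of equality (11) will be less than $\varepsilon$. Since the $c_n$
are bounded, the second and third terms for fixed n also approach 0 as
$N \to \infty$. □

This proves equality (10), since the conditions of the lemma are
always satisfied in the case under consideration in view of the fact
that the $p^{(n)}(j,j)$ are bounded. □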
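
Relation (10) can be checked numerically on a small chain. The sketch below is only an illustration under assumptions introduced here: the three-state matrix `P` (states 0 and 1 are non-recurrent, state 2 is absorbing) and the helper names are hypothetical. From state 0 the system either moves to state 1 or is absorbed, each with probability 1/2, so F(0,1) = 1/2.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def ratio_in_relation_10(P, i, j, N):
    """Return sum_{n=1}^N p^(n)(i,j) divided by sum_{n=0}^N p^(n)(j,j)."""
    size = len(P)
    power = [[1.0 if r == c else 0.0 for c in range(size)] for r in range(size)]
    numer = 0.0
    denom = power[j][j]             # the n = 0 term, p^(0)(j,j) = 1
    for _ in range(N):
        power = mat_mul(power, P)
        numer += power[i][j]
        denom += power[j][j]
    return numer / denom


# Hypothetical chain: 0 -> 1 or 2, 1 -> 0 or 2, 2 absorbing.
P = [[0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0]]
ratio = ratio_in_relation_10(P, 0, 1, 200)
```

Here the numerator tends to G(0,1) = 2/3 and the denominator to G(1,1) = 4/3, so the ratio approaches F(0,1) = 1/2, as (10) asserts.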

From formula (10) we obtain

Theorem 6. In a recurrent class $G(i,j) = +\infty$; if, however, j is not
recurrent, then $G(i,j) < \infty$ for all i.

Indeed, if j is a non-recurrent state, then the denominator in the
left-hand side of (10) tends to a finite limit, and therefore the limit of the
numerator is also finite. If, however, j is recurrent then the limit of the
denominator is $\infty$, and if $F(i,j) > 0$ then the limit of the numerator is
also $\infty$. □



Periodicity. Note that if $p^{(n)}(i,i) > 0$, then $p^{(kn)}(i,i) > 0$ also. Indeed

$$p^{(kn)}(i,i) \ge p^{(n)}(i,i)\, p^{(n)}(i,i) \cdots p^{(n)}(i,i).$$

Denote by $d(i)$ the greatest common divisor of all n such that $p^{(n)}(i,i) > 0$.
If $p^{(n)}(i,i) = 0$ for all $n \ge 1$ we shall assume that $d(i) = \infty$.
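
For a finite chain the definition of d(i) can be tried out directly. The following sketch is an illustration only: the function name `period`, the cut-off `n_max` and the two example matrices are assumptions made here. It tracks the set of states reachable in exactly n steps and takes the greatest common divisor of those n not exceeding `n_max` for which a return to i is possible.

```python
from math import gcd, inf


def period(P, i, n_max):
    """gcd of all n <= n_max with p^(n)(i,i) > 0; inf if there is no such n."""
    size = len(P)
    reachable = {i}                 # states reachable from i in exactly n steps
    d = 0
    for n in range(1, n_max + 1):
        reachable = {k for s in reachable for k in range(size) if P[s][k] > 0}
        if i in reachable:
            d = gcd(d, n)
    return d if d > 0 else inf


# Hypothetical examples: a walk on a cycle of four states and a lazy version of it.
cycle = [[0.0, 0.5, 0.0, 0.5],
         [0.5, 0.0, 0.5, 0.0],
         [0.0, 0.5, 0.0, 0.5],
         [0.5, 0.0, 0.5, 0.0]]
lazy = [[0.5, 0.25, 0.0, 0.25],
        [0.25, 0.5, 0.25, 0.0],
        [0.0, 0.25, 0.5, 0.25],
        [0.25, 0.0, 0.25, 0.5]]
d_cycle = period(cycle, 0, 20)
d_lazy = period(lazy, 0, 20)
```

The truncation at `n_max` is a practical assumption: the value returned is the gcd over the return times observed so far, which agrees with d(i) once `n_max` is large enough for the chain at hand.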

Theorem 7. If $i \leftrightarrow j$, then $d(i) = d(j)$.

Proof. Firstly, if $i \leftrightarrow j$ then $d(i)$ and $d(j)$ are finite. Let $p^{(s)}(i,i) > 0$. There
are n > 0 and m > 0 such that $p^{(n)}(i,j) > 0$ and $p^{(m)}(j,i) > 0$. Hence
$p^{(n+m+s)}(j,j) \ge p^{(m)}(j,i)\, p^{(s)}(i,i)\, p^{(n)}(i,j) > 0$. Analogously $p^{(n+m+2s)}(j,j) > 0$.
Therefore $d(j)$ divides $(n + m + 2s) - (n + m + s) = s$. It follows from
here that $d(j) \le d(i)$. Interchanging the roles of i and j, we have $d(i) \le d(j)$ in
view of symmetry, i.e. $d(i) = d(j)$. □

Corollary. In any class of communicative states the quantity $d(i)$ is
constant.

In particular, for an irreducible Markov chain the quantity $d = d(i)$
is independent of the state.

Definition 4. If in an irreducible chain d = 1, then the Markov chain is
called aperiodic; if d > 1, the chain is called periodic and the number d
is its period.

The next lemma is a number-theoretic result.

Lemma 2. Let d be the greatest common divisor of a sequence of positive
integers $n_1, n_2, \ldots, n_s$. There exists a number $m_0 > 0$ such that for all
integers $m \ge m_0$ the indeterminate equation

$$md = \sum_{j=1}^{s} c_j n_j$$

has a solution in non-negative integers $c_j$.

Proof. Let A be the set of all numbers admitting a representation $x = \sum_{j=1}^{s} a_j n_j$,
where the $a_j$ are integers (positive, negative or zero). Every $x \in A$ is divisible by d.
Let $d_0$ be the smallest positive integer belonging to A. Since $x - kd_0 \in A$
for any integral k, for any $x \in A$ a k can be found such that $x = kd_0$. (Otherwise
a $k_1$ could be found such that $x_1 = x - k_1 d_0$ would satisfy $0 < x_1 < d_0$,
which contradicts the definition of $d_0$.) Thus $d_0$ is the greatest common
divisor of the numbers in A, i.e. $d_0 = d$. Next let $B = \{x : x = \sum_{j=1}^{s} b_j n_j\}$, where the $b_j$ are
non-negative integers, and let $d_1 = \sum_{j=1}^{s} n_j$. The number $d_0$ can be represented in
the form $d_0 = N_1 - N_2$, where $N_i \in B$. Let c be the largest integer-valued
coefficient of the $n_j$ contained in $N_2$. For any integer m > 0 we set $m = kd_1 + m_1$,
where $0 \le m_1 < d_1$. Then $md_0 = kd_0 d_1 + m_1 N_1 - m_1 N_2 \in B$ provided $kd_0 \ge m_1 c$,
which is evidently satisfied if either $k > d_1 c / d_0$ or $m > d_1^2 c / d_0 + d_1$. The
lemma is thus proved. □
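
The content of Lemma 2 can be explored with a small dynamic-programming sketch. The function name `threshold_m0` and the search `bound` are assumptions made for illustration; the code only verifies representability for m up to `bound` and does not prove anything beyond that range.

```python
from functools import reduce
from math import gcd


def threshold_m0(ns, bound):
    """Return (d, m0): d = gcd(ns), and m0 is the smallest m >= 1 such that
    m*d is a non-negative integer combination of ns for every m0 <= m <= bound."""
    d = reduce(gcd, ns)
    limit = bound * d
    representable = [False] * (limit + 1)
    representable[0] = True         # the empty combination
    for x in range(1, limit + 1):
        representable[x] = any(x >= n and representable[x - n] for n in ns)
    m0 = bound + 1
    for m in range(bound, 0, -1):   # scan downwards until the first failure
        if not representable[m * d]:
            break
        m0 = m
    return d, m0


result_a = threshold_m0([6, 10, 15], 200)
result_b = threshold_m0([4, 6], 50)
```

For the numbers 6, 10, 15 (d = 1) the largest integer that cannot be represented is 29, so $m_0 = 30$; for 4 and 6 (d = 2) every even number from 4 on is representable, so $m_0 = 2$.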



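Theorem 8. If $d(i) < \infty$, an $n_0$ can be found such that for $n \ge n_0$

$$p^{(nd(i))}(i,i) > 0.$$

Proof. Let $n_k$ (k = 1, 2, ..., s) be a sequence of numbers such that
$p^{(n_k)}(i,i) > 0$ and let the greatest common divisor of the numbers $n_1, n_2, \ldots, n_s$
be equal to $d(i)$. In view of the preceding lemma, $n_0$ can be found
such that for $n \ge n_0$ we have $nd(i) = \sum_{k=1}^{s} c_k n_k$ with non-negative integers $c_k$. Therefore

$$p^{(nd(i))}(i,i) \ge \prod_{k=1}^{s} \big[p^{(n_k)}(i,i)\big]^{c_k} > 0. \quad □$$

Corollary. If $p^{(m)}(j,i) > 0$, then for all n sufficiently large

$$p^{(m+nd(i))}(j,i) > 0.$$

Indeed, $p^{(m+nd(i))}(j,i) \ge p^{(m)}(j,i)\, p^{(nd(i))}(i,i) > 0$.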

When studying Markov chains it is often more convenient to investigate
aperiodic chains first and then extend the results obtained to
periodic ones.

We now show that the period of a state can be computed from the 
probability of the first recurrence. 

Lemma 3. The period of the i-th state coincides with the greatest common
divisor of the collection of n such that $f^{(n)}(i,i) > 0$.

Proof. Let $Z_N$ and $Z'_N$ be the sets of n such that $n \le N$ and such that
$p^{(n)}(i,i) > 0$ and $f^{(n)}(i,i) > 0$ respectively, and let $d_N$ and $d'_N$ be their greatest
common divisors. Clearly, $Z'_N \subset Z_N$ and hence $d'_N \ge d_N$. Moreover $d'_1 = d_1$.
Let there exist N such that $d_n = d'_n$ for $n \le N$ and $d'_{N+1} > d_{N+1}$. Then
$f^{(N+1)}(i,i) = 0$ and $p^{(N+1)}(i,i) > 0$. In view of the equality

$$p^{(N+1)}(i,i) = f^{(N+1)}(i,i) + \sum_{k=1}^{N} f^{(k)}(i,i)\, p^{(N+1-k)}(i,i)$$

we have for some s, $0 < s \le N$, the
inequality $f^{(s)}(i,i)\, p^{(N+1-s)}(i,i) > 0$, i.e. s and N + 1 - s are divisible by $d_N$,
and therefore N + 1 is divisible by $d_N$, which contradicts the inequality
$d_{N+1} < d'_{N+1}$. The lemma is thus proved. □

Theorem 9. Any class K of communicative states of period d ($d < \infty$) can
be subdivided into d pairwise disjoint subclasses $K_0, \ldots, K_{d-1}$ such that
in one step from $K_s$ ($s < d-1$) one can move only to $K_{s+1}$, and from $K_{d-1}$
only to $K_0$. Moreover, if $i \in K_r$, $j \in K_s$, then an $N = N(i,j)$ can be found such
that $p^{(nd+s-r)}(i,j) > 0$ for n > N.

Proof. Let $K_0$ be the set of all the states j such that $p^{(kd)}(i,j) > 0$
for at least one positive integer k, where i is an arbitrarily chosen state
in K. Then $i \in K_0$. Since i and j communicate, an m can be found such
that $p^{(m)}(j,i) > 0$. The number m is a multiple of d. Indeed, $p^{(kd+m)}(i,i) \ge p^{(kd)}(i,j)\, p^{(m)}(j,i) > 0$
and hence kd + m is divisible by d. Since m is divisible
by d, $K_0$ would remain unchanged if one were to take, in place of i in the definition
of $K_0$, an arbitrary j for which $p^{(kd)}(i,j) > 0$ for some k.
We now define $K_1$ as the set of j, $j \in K$, such that $\sum_{i \in K_0} p(i,j) > 0$,
$K_2$ as the set of all those j, $j \in K$, such that $\sum_{i \in K_1} p(i,j) > 0$, and so on. It

follows from the definition of the sets that for any r and s. 

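On the other hand, if $j \in K_s$, then $j_0, j_1, \ldots, j_s = j$ can be found such that
$j_r \in K_r$ and $p(j_{r-1}, j_r) > 0$, i.e. $p^{(s)}(j_0, j) > 0$. The converse is also
true: if $p^{(s-r)}(j_r, j) > 0$, $j_r \in K_r$, $j \in K$, then $j \in K_s$ (since from $j_r \to j_{r+1}$, $j_{r+1} \to j$
and $j \to j_r$ it follows that $j_{r+1} \to j_r$). We now show that the classes $K_r$ and
$K_s$, $0 \le r < s < d$, are disjoint. Indeed, let $j \in K_r$ and $j \in K_s$. Then $i_1$ and $i_2 \in K_0$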
can be found such that $p^{(r)}(i_1, j) > 0$ and $p^{(s)}(i_2, j) > 0$. Since $i_2$ and j communicate,
it follows that for some m, $p^{(m)}(j, i_2) > 0$. Therefore $p^{(m+s)}(i_2, i_2) \ge p^{(s)}(i_2, j)\, p^{(m)}(j, i_2) > 0$,
hence m + s is divisible by d, i.e. m = kd - s,
where k is an integer. But then

$$0 < p^{(r)}(i_1, j)\, p^{(kd-s)}(j, i_2) \le p^{(kd-(s-r))}(i_1, i_2),$$

which is impossible, as was shown above, since transitions from $i_1$
into $i_2$, $i_1, i_2 \in K_0$, are possible only in a number of steps which is a multiple
of d. Next, let $i \in K_r$ and $j \in K_s$. One can find m such that $p^{(m)}(i,j) > 0$.
This m is of the form $m = k_0 d + (s - r)$. On the other hand, in view of
Theorem 8, $p^{(nd)}(i,i) > 0$ for all $n \ge n_0(i)$. Therefore

$$p^{((n+k_0)d+s-r)}(i,j) \ge p^{(nd)}(i,i)\, p^{(m)}(i,j) > 0 \quad \text{for all } n > n_0(i).$$

The theorem is thus proved. □

We shall refer to the sets $K_0, K_1, \ldots, K_{d-1}$ as the subclasses of a periodic
class of communicative states.
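
For a finite irreducible chain the subclasses can be computed by a breadth-first search. The sketch below is not the construction used in the proof; it is a standard graph technique, swapped in here for illustration, that yields the same decomposition. The function name `cyclic_subclasses` and the example matrix are assumptions. Each state receives its search level from a starting state; the period is the gcd of `level[u] + 1 - level[v]` over all one-step transitions u to v, and the subclass of a state is its level modulo d.

```python
from collections import deque
from math import gcd


def cyclic_subclasses(P, start=0):
    """Return (d, classes) for a finite irreducible chain with transition matrix P."""
    size = len(P)
    level = {start: 0}
    queue = deque([start])
    while queue:                    # breadth-first search from the starting state
        u = queue.popleft()
        for v in range(size):
            if P[u][v] > 0 and v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    d = 0
    for u in level:
        for v in range(size):
            if P[u][v] > 0:
                d = gcd(d, level[u] + 1 - level[v])
    classes = [[] for _ in range(d)]
    for state in sorted(level):
        classes[level[state] % d].append(state)
    return d, classes


# Hypothetical example: symmetric walk on a cycle of four states.
P = [[0.0, 0.5, 0.0, 0.5],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.5, 0.0, 0.5, 0.0]]
d, classes = cyclic_subclasses(P)
```

In this example the chain alternates between the even and the odd states, so d = 2 with $K_0 = \{0, 2\}$ and $K_1 = \{1, 3\}$, in agreement with Theorem 9.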

The basic theorem of renewal theory. In order to study the asymptotic
behavior of the transition probabilities $p^{(n)}(i,j)$ as $n \to \infty$ we shall use a
theorem which is often called the basic theorem of renewal theory. We
shall confine ourselves to the version required in the sequel which,
however, is not the most general one. To explain the terminology, assume that
we are considering the performance of a piece of equipment which may
fail from time to time. When a piece fails it is immediately replaced by
a new one. The duration of the survival period of the n-th piece is a
random variable $\tau_n$ taking on the values 1, 2, ..., and the random variables $\tau_n$
(n = 0, 1, ...) are mutually independent and identically distributed. Set

$$p_k = P\{\tau_n = k\}, \quad k = 1, 2, \ldots, \qquad \sum_{k=1}^{\infty} p_k = 1.$$

The sum $\tau_0 + \tau_1 + \ldots + \tau_{n-1}$ is called the instant of the n-th renewal,
while the variable $\tau_n$ is called the duration of the n-th renewal. Denote
by $G(n)$ the probability that n is an instant of renewal. The events

$$\{\tau_0 = n\},\ \{\tau_0 + \tau_1 = n\},\ \ldots,\ \{\tau_0 + \tau_1 + \ldots + \tau_{k-1} = n\},\ \ldots$$

are pairwise disjoint, therefore

$$G(n) = P\{\tau_0 = n\} + P\{\tau_0 + \tau_1 = n\} + \ldots + P\{\tau_0 + \tau_1 + \ldots + \tau_{k-1} = n\} + \ldots$$

and $G(n) \le 1$ for $n \ge 1$. Set $G(0) = 1$. The function $G(n)$ is called the renewal
function.

The theorem which characterizes the asymptotic behavior of $G(n)$ as
$n \to \infty$ is called the basic theorem of renewal theory.

Denote by d the greatest common divisor of those n for which $p_n > 0$.
If d = 1 the renewal process is called aperiodic; if d > 1 the renewal process
is periodic and d is called the renewal period. It can be easily seen that
in the case of an aperiodic renewal $G(n) > 0$ for all n starting with some
$n_0$, $n \ge n_0$. If, however, d > 1 then for all k sufficiently large, $k \ge k_0$,
$G(kd) > 0$. These assertions follow from the arithmetical Lemma 2. It
turns out that if the renewal is aperiodic, then $G_\infty = \lim_{n \to \infty} G(n) = \frac{1}{m}$, where
$m = E\tau_n$ (for $E\tau_n = \infty$, $G_\infty = 0$).

We first prove the existence of the limit $G_\infty$ and then find its value.

Lemma 4. Let $\tau$ be a random variable taking on the values n (n = 0, ±1, ±2, ...)
with probabilities $p_n$, and let $J(u)$ be the characteristic function of the variable
$\tau$. If d = 1 then $J(u) \neq 1$ for $|u| < 2\pi$, $u \neq 0$.

Proof. We have

$$J(u) = E e^{iu\tau} = \sum_{n=-\infty}^{\infty} p_n e^{iun}.$$

Let $J(u_0) = 1$, $|u_0| < 2\pi$, $u_0 \neq 0$. We have

$$0 = 1 - \operatorname{Re} J(u_0) = \sum_{n=-\infty}^{\infty} (1 - \cos nu_0)\, p_n.$$

Therefore $\cos nu_0 = 1$ for those n such that $p_n > 0$, or $nu_0 = 2\pi k$. Select
a sequence of integers $n_1, n_2, \ldots, n_s$ such that $p_{n_r} > 0$ and such that their
greatest common divisor is 1. Then $n_r u_0 = 2\pi k_r$ (r = 1, 2, ..., s). On the
other hand, the equation $\sum_{r=1}^{s} a_r n_r = 1$ has a solution in integral $a_r$. Therefore

$$u_0 = \sum_{r=1}^{s} a_r n_r u_0 = 2\pi \sum_{r=1}^{s} a_r k_r = 2\pi k_0,$$

where $k_0$ is an integer, which contradicts the condition that $|u_0| < 2\pi$, $u_0 \neq 0$. □

Lemma 5. If the renewal is aperiodic, then the limit $G_\infty = \lim_{n \to \infty} G(n)$ exists.

Proof. Set

$$G(z,n) = \sum_{s=0}^{\infty} z^s p_n(s), \quad 0 \le z \le 1,$$

where $p_n(s) = P(\eta_s = n)$ and $\eta_s = \tau_0 + \tau_1 + \ldots + \tau_{s-1}$ for $s \ge 1$, and $p_n(0) = 0$.
It follows from Abel's theorem in the theory of power series that

$$G(n) = \lim_{z \uparrow 1} G(z,n).$$

Since the characteristic function of the random variable $\eta_s$ is equal to $[J(u)]^s$,

$$[J(u)]^s = \sum_{n=1}^{\infty} p_n(s)\, e^{iun},$$

where $J(u)$ is the characteristic function of the random variable $\tau_0$, it
follows that

$$p_n(s) = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-inu}\, [J(u)]^s\, du.$$

Therefore

$$G(z,n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{e^{-inu}}{1 - zJ(u)}\, du, \quad n \ge 0.$$

The integral in the r.h.s. of the last formula vanishes for n < 0. Therefore

$$G(z,n) = \frac{1}{\pi} \int_{-\pi}^{\pi} \frac{\cos nu}{1 - zJ(u)}\, du.$$

Set $h(z,u) = \frac{1}{\pi} \operatorname{Re} (1 - zJ(u))^{-1}$. Since $G(z,n)$ is a real-valued function,
we have

$$G(z,n) = \int_{-\pi}^{\pi} h(z,u) \cos nu\, du.$$

In view of the aperiodicity of the renewal and Lemma 4, the kernel
$h(z,u)$ ($z \in [0,1]$, $0 < |u| < 2\pi$) is positive and continuous. Therefore for
any $\varepsilon > 0$

$$G(n) = \lim_{z \uparrow 1} \int_{-\varepsilon}^{\varepsilon} h(z,u) \cos nu\, du + \int_{\varepsilon \le |u| \le \pi} h(1,u) \cos nu\, du. \tag{12}$$

Putting n = 0 here we observe that the limit

$$h_\varepsilon = \lim_{z \uparrow 1} \int_{-\varepsilon}^{\varepsilon} h(z,u)\, du$$

exists and that $h_\varepsilon \le G(0)$. Since $h_\varepsilon$ decreases as $\varepsilon \downarrow 0$, the limit $\lim_{\varepsilon \to 0} h_\varepsilon = h_0$ also
exists. Therefore the double limit

$$\lim_{\varepsilon \to 0} \lim_{z \uparrow 1} \int_{-\varepsilon}^{\varepsilon} h(z,u) \cos nu\, du = h$$

also exists. Returning now to formula (12) we see that $h(1,u)$ is an
integrable function (in the Cauchy sense) on the interval $(-\pi, \pi)$ and that

$$G(n) = h + \int_{-\pi}^{\pi} h(1,u) \cos nu\, du.$$

Since $h(1,u)$ is integrable, it follows from the Riemann-Lebesgue theorem
that

$$\lim_{n \to \infty} \int_{-\pi}^{\pi} h(1,u) \cos nu\, du = 0.$$

Therefore the existence of $\lim_{n \to \infty} G(n) = h$ is proved. □
Theorem 10. If the renewal is aperiodic, then

$$\lim_{n \to \infty} G(n) = \frac{1}{m}, \quad m = E\tau_n;$$

moreover, if $E\tau_n = \infty$, then $\lim_{n \to \infty} G(n) = 0$.

Proof. Since, in view of the previous lemma, the limit $\lim_{n \to \infty} G(n) = h$ exists,
we obtain, using Abel's theorem on power series, that

$$h = \lim_{z \uparrow 1} \sum_{n=0}^{\infty} z^n (1 - z)\, G(n) = \lim_{z \uparrow 1} (1 - z)\, \Phi(z),$$

where $\Phi(z) = \sum_{n=0}^{\infty} z^n G(n)$ is the generating function of the sequence
$(G(n),\ n = 0, 1, \ldots)$. From the fact that the $\tau_n$ are independent and identically
distributed it follows that $G(n)$ satisfies the equation

$$G(n) = \delta(n) + \sum_{k=1}^{n} G(n-k)\, p_k, \quad n \ge 0 \tag{13}$$

($\delta(n) = 0$ for n > 0, $\delta(0) = 1$). Multiplying this relation by $z^n$ and summing
up over all $n \ge 0$ we obtain

$$\Phi(z) = 1 + F(z)\, \Phi(z), \quad |z| < 1,$$

where $F(z) = \sum_{n=1}^{\infty} p_n z^n$. Thus

$$\Phi(z) = [1 - F(z)]^{-1}$$

and

$$h = \lim_{z \uparrow 1} \left[\frac{1 - F(z)}{1 - z}\right]^{-1}.$$

If $m = \infty$, then for any N > 0

$$\lim_{z \uparrow 1} \frac{1 - F(z)}{1 - z} = \lim_{z \uparrow 1} \sum_{n=1}^{\infty} p_n \frac{1 - z^n}{1 - z} \ge \sum_{n=1}^{N} n p_n.$$

From here it follows that h = 0. If, however, $m < \infty$ then, taking into account
the inequality $\frac{1 - z^n}{1 - z} < n$ for $0 \le z < 1$, we obtain

$$\lim_{z \uparrow 1} \frac{1 - F(z)}{1 - z} = \lim_{z \uparrow 1} \sum_{n=1}^{\infty} p_n \frac{1 - z^n}{1 - z} = \sum_{n=1}^{\infty} p_n n = m.$$

The theorem is thus proved. □
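
Equation (13) gives a direct way to tabulate the renewal function, and Theorem 10 can then be observed numerically. The sketch below is illustrative only: the function name `renewal_function` and the particular distribution (durations 1 and 2 with probability 1/2 each) are assumptions introduced here.

```python
def renewal_function(p, n_max):
    """Tabulate G(0..n_max) from G(n) = delta(n) + sum_{k=1}^n G(n-k) p_k.

    p[k] is P{tau = k}; the entry p[0] is not used.
    """
    G = [1.0]                       # G(0) = 1
    for n in range(1, n_max + 1):
        top = min(n, len(p) - 1)
        G.append(sum(G[n - k] * p[k] for k in range(1, top + 1)))
    return G


# Hypothetical aperiodic renewal: durations 1 and 2, each with probability 1/2.
p = [0.0, 0.5, 0.5]
m = sum(k * pk for k, pk in enumerate(p))   # mean duration, here 1.5
G = renewal_function(p, 60)
```

In this example G(1) = 0.5, G(2) = 0.75, and the values oscillate towards 1/m = 2/3 with an error that halves at every step.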

Corollary. If the renewal is of the period d, then

$$\lim_{n \to \infty} G(nd) = \frac{d}{m}, \quad m = E\tau_n. \tag{14}$$

Indeed, if the given renewal is periodic and d is its period, then the




104 



Chapter II, Random Sequences 



new renewal with the durations $\tau'_n = \tau_n / d$ is aperiodic. If $G'(n)$ is its renewal
function, then $G'(n) = G(nd)$. On the other hand $E\tau'_n = \frac{E\tau_n}{d} = \frac{m}{d}$.
Formula (14) now follows from the theorem just proved. □

Limit theorems for transition probabilities.

Theorem 11. Let $p^{(n)}(i,j)$ be the transition probabilities of an irreducible
recurrent aperiodic Markov chain. Denote by $m_i$ the average number of
steps until the first recurrence of the i-th state,

$$m_i = \sum_{n=1}^{\infty} n f^{(n)}(i,i).$$

Then for each j

$$\lim_{n \to \infty} p^{(n)}(j,i) = \frac{1}{m_i}. \tag{15}$$

Proof. Let $\tau_0$ be the number of steps until the first recurrence of the i-th
state, $\tau_1$ the number of steps between the first and second recurrence of
this state, and so on. In view of the corollary to Theorem 4, the variables
$\tau_0, \tau_1, \ldots, \tau_n, \ldots$ are mutually independent, identically distributed and
take on integer values greater than or equal to 1, and, moreover,

$$P\{\tau_k = n\} = f^{(n)}(i,i), \qquad \sum_{n=1}^{\infty} f^{(n)}(i,i) = 1.$$

The mathematical expectation of the variables $\tau_k$ is equal to $m_i$. Consider
a renewal process in which $\tau_n$ is the duration of the n-th renewal.
The quantities $p_n$ and $G(n)$ are replaced here by $f^{(n)}(i,i)$ and $p^{(n)}(i,i)$
respectively. Since the chain is aperiodic, in view of Lemma 3, the renewal
is also aperiodic.

From Theorem 10 the following equality is obtained:

$$\lim_{n \to \infty} p^{(n)}(i,i) = \frac{1}{m_i}.$$

This is a particular case of (15) for j = i. The general case can now be
dealt with easily. Using formula (4) we obtain

$$p^{(n)}(j,i) = \sum_{k=1}^{n} f^{(k)}(j,i)\, p^{(n-k)}(i,i).$$

Noting that $f^{(k)}(j,i) \ge 0$ and $\sum_{k=1}^{n} f^{(k)}(j,i) \to 1$ as $n \to \infty$ (in view of the
irreducibility and recurrency of the chain) and applying Lemma 1, we obtain
equality (15) in the general case. □

We have just proved the ergodic theorem for Markov chains. More 
on ergodic theorems is presented in Section 8. 
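
Equality (15) is easy to observe on a small example. In the sketch below the two-state matrix `P` and the helper names are assumptions made for illustration. The mean return time $m_i$ is accumulated from the first recurrence probabilities, which are obtained by propagating the distribution of the chain and discarding, at each step, the paths that have already returned to i. The n-step matrix is obtained by repeated multiplication.

```python
def mean_return_time(P, i, n_max):
    """Approximate m_i = sum_n n f^(n)(i,i), truncated at n_max steps."""
    size = len(P)
    dist = list(P[i])               # distribution after one step from i
    m = 0.0
    for n in range(1, n_max + 1):
        m += n * dist[i]            # dist[i] equals f^(n)(i,i) at this point
        dist[i] = 0.0               # discard paths that have already returned
        dist = [sum(dist[s] * P[s][k] for s in range(size)) for k in range(size)]
    return m


def n_step_matrix(P, n):
    """Return the matrix of n-step transition probabilities."""
    size = len(P)
    power = [[1.0 if r == c else 0.0 for c in range(size)] for r in range(size)]
    for _ in range(n):
        power = [[sum(power[r][k] * P[k][c] for k in range(size))
                  for c in range(size)] for r in range(size)]
    return power


# Hypothetical irreducible aperiodic chain on two states.
P = [[0.5, 0.5],
     [0.25, 0.75]]
m0 = mean_return_time(P, 0, 400)
Pn = n_step_matrix(P, 50)
```

For this chain $m_0 = 3$, and both $p^{(50)}(0,0)$ and $p^{(50)}(1,0)$ agree with $1/m_0$ to high accuracy, so the limit in (15) does not depend on the initial state.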

Theorem 12. If an irreducible recurrent Markov chain is periodic with
period d, then

$$\lim_{n \to \infty} p^{(nd)}(i,i) = \frac{d}{m_i}. \tag{16}$$

If $K_r$ are the subclasses introduced in Theorem 9 and $i \in K_r$, $j \in K_s$, then

$$\lim_{n \to \infty} p^{(nd+l)}(i,j) = \begin{cases} \dfrac{d}{m_j}, & l \equiv s - r \pmod d, \\ 0, & l \not\equiv s - r \pmod d. \end{cases} \tag{17}$$

Proof. It follows from Lemma 3 that the period of an irreducible Markov
chain coincides with the period of the renewal process introduced in the
proof of the previous theorem. Therefore equality (16) follows directly
from the corollary of Theorem 10. From Theorem 9 we have:
$p^{(nd+l)}(i,j) = 0$ for $i \in K_r$, $j \in K_s$
and $l \not\equiv s - r \pmod d$. Therefore if, for example, r < s, then

$$p^{(nd+s-r)}(i,j) = \sum_{k=0}^{n} f^{(kd+s-r)}(i,j)\, p^{((n-k)d)}(j,j).$$

Referring to Lemma 1, the proof of formula (17) is completed as in the
proof of Theorem 11. □

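Definition 5. A recurrent state j is called a null state if $\lim_{n \to \infty} p^{(nd)}(j,j) = 0$
and is called a positive state if $\lim_{n \to \infty} p^{(nd)}(j,j) > 0$.

In a recurrent class of states the states are either all positive or all
null. Indeed, if $i \leftrightarrow j$ then it follows from the inequality

$$p^{(nd+m+s)}(i,i) \ge p^{(m)}(i,j)\, p^{(nd)}(j,j)\, p^{(s)}(j,i)$$

(where m and s are such that $p^{(m)}(i,j) > 0$, $p^{(s)}(j,i) > 0$) that

$$\lim_{n \to \infty} p^{(nd)}(i,i) \ge p^{(m)}(i,j)\, p^{(s)}(j,i) \lim_{n \to \infty} p^{(nd)}(j,j), \quad d = d_i = d_j.$$

Interchanging the roles of i and j we obtain the proof of the assertion.
The results obtained can be summarized as follows: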

Theorem 13. a) In order that the state j be non-recurrent it is necessary
and sufficient that $G_{jj} = \sum_{n=1}^{\infty} p^{(n)}(j,j) < \infty$. Moreover, for all i

$$G_{ij} = \sum_{n=1}^{\infty} p^{(n)}(i,j) < \infty, \qquad \lim_{n \to \infty} p^{(n)}(i,j) = 0.$$

b) Let j be a recurrent state with period d and mean return time $m_j$.
If i is accessible from j, then i is also a recurrent state with the same period
d, and is either null or positive depending on whether j is null or positive,
and there exists k, $0 \le k < d$, depending on i and j only, such that

$$\lim_{n \to \infty} p^{(nd+r)}(i,j) = \begin{cases} \dfrac{d}{m_j} & \text{if } r = k, \\ 0 & \text{if } r \not\equiv k \pmod d. \end{cases} \tag{18}$$

c) If the states i and j belong to the same recurrent class then

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} p^{(n)}(i,j) = \frac{1}{m_j}. \tag{19}$$



The last assertion is a direct consequence of b). On the other hand 
unlike assertion b) formula (19) does not reflect the distinction between 
periodic and aperiodic classes of states. An irreducible recurrent Markov 
chain is called positive (null) if its states are positive (null). 
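
Formula (19) can be illustrated on a periodic chain, where the transition probabilities themselves do not converge but their averages do. The function name `cesaro_average` and the four-state example are assumptions introduced for this sketch.

```python
def cesaro_average(P, i, j, N):
    """Return (1/N) * sum_{n=1}^N p^(n)(i,j), computed by propagating row i."""
    size = len(P)
    row = [1.0 if k == i else 0.0 for k in range(size)]
    total = 0.0
    for _ in range(N):
        row = [sum(row[s] * P[s][k] for s in range(size)) for k in range(size)]
        total += row[j]
    return total / N


# Hypothetical periodic chain (d = 2): symmetric walk on a cycle of four states.
P = [[0.0, 0.5, 0.0, 0.5],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.5, 0.0, 0.5, 0.0]]
avg_00 = cesaro_average(P, 0, 0, 2000)
avg_01 = cesaro_average(P, 0, 1, 2000)
```

Here $p^{(n)}(0,0)$ equals 1/2 for even $n \ge 2$ and 0 for odd n, in agreement with (18) for d = 2 and $m_0 = 4$, while both averages equal 1/4 = 1/m_j, as (19) states.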

Criteria for recurrency. Stationary distributions. The property of a Markov
chain of being recurrent (positive or null) is closely connected with
non-trivial solutions of the system of linear equations

$$\sum_{j \in I} p(j,i)\, x_j = x_i, \quad i \in I, \tag{20}$$

and its transpose

$$\sum_{j \in I} p(i,j)\, x_j = x_i, \quad i \in I. \tag{21}$$

If the system (20) admits a non-negative and summable solution, i.e.
$x_i \ge 0$ and $\sum_{i \in I} x_i < \infty$, we may then assume that $\sum_{i \in I} x_i = 1$, and such a solution
may be interpreted as an invariant initial distribution $x_i = P\{\xi(0) = i\} = P\{\xi(1) = i\} = \ldots$,
which generates a stationary Markov process. On the
other hand, the existence of a stationary Markov process with given
transition probabilities is equivalent to the existence of a non-negative
summable solution of the system (20).
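
The invariance expressed by system (20) can be checked directly for a small chain. The sketch below uses a hypothetical three-state matrix `P` and a candidate vector `x`, both chosen here for illustration, and a helper `apply_system_20` that evaluates the left-hand side of (20).

```python
def apply_system_20(P, x):
    """Left-hand side of (20): the vector with entries sum_j p(j,i) x_j."""
    size = len(P)
    return [sum(P[j][i] * x[j] for j in range(size)) for i in range(size)]


# Hypothetical chain and candidate invariant distribution.
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
x = [0.25, 0.5, 0.25]

one_step = apply_system_20(P, x)    # distribution of xi(1) if xi(0) has distribution x
ten_steps = x
for _ in range(10):
    ten_steps = apply_system_20(P, ten_steps)

# For the transposed system (21) a constant vector is always a solution,
# because every row of P sums to one.
row_sums = [sum(row) for row in P]
```

Since `one_step` coincides with `x`, the distribution of $\xi(n)$ stays equal to `x` for every n, which is the sense in which `x` generates a stationary process.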

As far as the transposed system (21) is concerned, the existence of a
non-trivial solution $x_i = c$ for this system is evident. A characteristic
feature of a recurrent Markov chain is that (21) does not admit other
non-trivial non-negative solutions. Moreover the following theorem
holds:

Theorem 14. An irreducible Markov chain is recurrent if and only if the
system of inequalities

$$\sum_{j \in I} p(i,j)\, x_j \le x_i, \quad i \in I, \tag{22}$$

admits no non-negative solutions other than solutions of the form $x_i = c$,
$i \in I$.

Proof. Assume that a chain is recurrent and that $x_i \ge 0$ ($i \in I$) constitute
a solution of the system (22). We choose an arbitrary $x_l > 0$ (if there is
no such $x_l$, then all $x_i = 0$). It follows from (22) that

$$x_i \ge \sum_{j \in I} p(i,j)\, x_j \ge \sum_{j \in I} p(i,j) \sum_{k \in I} p(j,k)\, x_k = \sum_{k \in I} p^{(2)}(i,k)\, x_k$$

and by induction

$$x_i \ge \sum_{k \in I} p^{(n)}(i,k)\, x_k.$$

For each i, an n can be found such that $p^{(n)}(i,l) > 0$; therefore
$x_i \ge p^{(n)}(i,l)\, x_l > 0$. Thus $x_i > 0$ for all $i \in I$. Set $y_i = \frac{x_i}{x_l}$, where l is an
arbitrarily chosen state. We have $y_i \ge \sum_{j \in I} p(i,j)\, y_j = p(i,l) + \sum_{j \neq l} p(i,j)\, y_j$.
Applying this inequality to the quantities $y_j$ appearing in the r.h.s., we obtain

$$y_i \ge p(i,l) + \sum_{j \neq l} p(i,j)\, p(j,l) + \sum_{j \neq l} \sum_{k \neq l} p(i,j)\, p(j,k)\, y_k = f^{(1)}(i,l) + f^{(2)}(i,l) + \sum_{k \neq l} {}_l p^{(2)}(i,k)\, y_k,$$

where ${}_l p^{(2)}(i,k) = \sum_{j \neq l} p(i,j)\, p(j,k)$ is the probability of hitting the k-th
state on the second step, after leaving the i-th state, and not entering the
l-th state. Iterating this method we arrive at the inequality

$$y_i \ge \sum_{n=1}^{N} f^{(n)}(i,l) + \sum_{k \neq l} {}_l p^{(N)}(i,k)\, y_k,$$

where ${}_l p^{(N)}(i,k)$ is the N-step transition probability from the i-th state
into the k-th state not entering the l-th state. Letting $N \to \infty$ in the
last inequality we obtain

$$y_i \ge \sum_{n=1}^{\infty} f^{(n)}(i,l) = F(i,l) = 1,$$

i.e. $x_i \ge x_l$.
Since i and l are arbitrary states, it follows that $x_i = x_l = \text{const}$, i.e.
the system of inequalities (22) admits no non-negative solutions other
than $x_i = c$, $i \in I$, for which the sign of equality holds in all the relations
of system (22).

Now let the chain possess at least one non-recurrent state (the
irreducibility of the chain is not utilized here). Set $x_l = 1$, $x_i = F(i,l)$ for $i \neq l$,
where l is an arbitrary non-recurrent state. Note that $F(i,l) = 1$ does
not hold for all i. Indeed, otherwise we would have

$$F(l,l) = \sum_{k \neq l} p(l,k)\, F(k,l) + p(l,l) = \sum_{k \in I} p(l,k) = 1,$$

which contradicts the fact that the state l is non-recurrent. Thus the
non-negative numbers $x_i$ defined above are not all equal. We have for $i \neq l$

$$x_i = F(i,l) = \sum_{k \neq l} p(i,k)\, F(k,l) + p(i,l) = \sum_{k \in I} p(i,k)\, x_k,$$

and

$$x_l = 1 > F(l,l) = \sum_{k \in I} p(l,k)\, x_k,$$

i.e. $\{x_i,\ i \in I\}$ forms a non-negative solution of the system (22) which is
not a constant. The theorem is thus proved. □

We now investigate the connection between the existence of invariant 
initial distributions and the recurrence properties of a Markov chain, i.e. 
we shall study the problem of solvability of system (20) for a recurrent 
chain. 

Theorem 15. Let a Markov chain be irreducible and recurrent. The system
of equations (20) can have no more than one solution satisfying the
conditions

$$\sum_{i \in I} |x_i| < \infty, \qquad \sum_{i \in I} x_i = 1. \tag{23}$$

If the chain is positive-recurrent, the solution of system (20) satisfying
relations (23) is of the form

$$x_i = v_i = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} p^{(n)}(i,i). \tag{24}$$

If, however, the chain is null-recurrent, then the only absolutely summable
solution of system (20) is the trivial one ($x_i = 0$).

Proof. We first prove the uniqueness of the solution of system (20) under
conditions (23). Let such a solution exist. Multiplying (20) by $p(i,k)$ and
summing up over all i, we obtain

$$x_k = \sum_{i \in I} x_i\, p(i,k) = \sum_{i \in I} \sum_{j \in I} x_j\, p(j,i)\, p(i,k) = \sum_{j \in I} x_j \sum_{i \in I} p(j,i)\, p(i,k) = \sum_{j \in I} x_j\, p^{(2)}(j,k).$$

The interchange of the order of summation is permissible since the
corresponding double series is absolutely convergent. Analogously we obtain

$$x_k = \sum_{j \in I} x_j\, p^{(n)}(j,k). \tag{25}$$

Set

$$s_N(j,k) = \frac{1}{N} \sum_{n=1}^{N} p^{(n)}(j,k);$$

then

$$x_k = \sum_{j \in I} x_j\, s_N(j,k).$$

Taking into account the fact that $s_N(j,k) \to m_k^{-1}$ and the absolute
convergence of the series $\sum_{j \in I} x_j$, and passing to the limit in the last equality,
we obtain

$$x_k = \frac{1}{m_k} \sum_{j \in I} x_j, \tag{26}$$

which proves the uniqueness of the solution of the system (20), (23). It
also follows from here that if the chain is null-recurrent, then $x_k = 0$ for
all $k \in I$.

We now prove that for a positive-recurrent chain the quantities (24)
constitute the required solution of system (20). Let $I'$ be an arbitrary
finite subset of I. It follows from the inequality

$$p^{(n+1)}(k,i) \ge \sum_{j \in I'} p^{(n)}(k,j)\, p(j,i)$$

that

$$\frac{N+1}{N}\, s_{N+1}(k,i) - \frac{1}{N}\, p(k,i) \ge \sum_{j \in I'} s_N(k,j)\, p(j,i).$$

Passing to the limit as $N \to \infty$ we obtain that

$$v_i \ge \sum_{j \in I'} v_j\, p(j,i).$$

Letting now $I' \uparrow I$ we obtain $v_i \ge \sum_{j \in I} v_j\, p(j,i)$. Multiplying the last
inequality by $p(i,k)$ and summing up with respect to i we arrive at the
inequalities

$$\sum_{i \in I} v_i\, p(i,k) \ge \sum_{i \in I} v_i\, p^{(2)}(i,k),$$

and continuing this process we arrive at the inequalities

$$v_k \ge \sum_{i \in I} v_i\, p^{(n)}(i,k)$$

for any $n \ge 1$. If there were a sign of strict inequality for at least one k
in the last relation, we would have

$$\sum_{k \in I} v_k > \sum_{k \in I} \sum_{i \in I} v_i\, p^{(n)}(i,k) = \sum_{i \in I} v_i \sum_{k \in I} p^{(n)}(i,k) = \sum_{i \in I} v_i,$$

which is impossible. Therefore

$$v_k = \sum_{i \in I} v_i\, p^{(n)}(i,k), \quad k \in I, \quad n = 1, 2, \ldots \tag{27}$$

In particular the quantities $v_i$ form a solution of the system (20). From
(27) we obtain

$$v_k = \sum_{i \in I} v_i\, s_N(i,k). \tag{28}$$

Note that the inequality $\sum_{k \in I'} p^{(n)}(i,k) \le 1$ yields the relations $\sum_{k \in I'} s_N(i,k) \le 1$
and $\sum_{k \in I'} v_k \le 1$ for any finite $I' \subset I$. Hence $\sum_{k \in I} v_k \le 1$. Therefore we may
pass to the limit in (28) as $N \to \infty$, which results in the equality
$v_k = \sum_{i \in I} v_i v_k$, so that $\sum_{i \in I} v_i = 1$. Therefore the solution $v_i$ of system (20)
satisfies conditions (23). The theorem is thus proved. □

Remark. If a Markov chain is arbitrary, $\{x_i,\ i \in I\}$ is an absolutely summable
solution of system (20), and k is a non-recurrent state, then $x_k = 0$.

This assertion follows from the fact that one may pass to the
limit as $n \to \infty$ in equation (25) and from the relation $\lim_{n \to \infty} p^{(n)}(j,k) = 0$,
which holds for an arbitrary non-recurrent k.

Corollaries.

1. In order that an irreducible Markov chain be positive-recurrent it is
necessary and sufficient that system (20) admit a non-trivial absolutely
summable solution $\{x_i,\ i \in I\}$. Moreover, $x_i = c v_i$, where c is a constant and $v_i > 0$.

2. An irreducible Markov chain possesses an invariant initial distribution
if and only if it is positive-recurrent.

3. If the chain is positive-recurrent and aperiodic then the unique
solution of system (20) satisfying (23) is of the form

$$x_i = v_i = \lim_{n \to \infty} p^{(n)}(i,i). \tag{29}$$

The last assertion follows from the fact that for a positive aperiodic 
chain the limits lim /) exist so that (24) yields (29). □ 
It follows from the previous theorem that system (20) cannot admit,
for a null-recurrent chain, a non-trivial absolutely summable solution.
However, it possesses an important non-negative non-summable
solution. To obtain this solution we introduce taboo-probabilities. This
notion is a generalization of the notion of the probability of the first hit.
It has already been encountered in the course of the proof of Theorem 14.
The taboo-probability ${}_lp^{(n)}(i,j)$ is the probability of hitting the $j$-th state at
the $n$-th step starting from the $i$-th state without entering the $l$-th state
during the times $1,2,\ldots,n-1$. Thus

$${}_lp^{(n)}(i,j)=\sum_{j_1\ne l,\ \ldots,\ j_{n-1}\ne l}p(i,j_1)\,p(j_1,j_2)\cdots p(j_{n-1},j).$$

Clearly, ${}_jp^{(n)}(i,j)=f^{(n)}(i,j)$.

We also set

$${}_lp^{(0)}(i,j)=\delta(i,j).$$

The taboo-probabilities ${}_Hp^{(n)}(i,j)$ are introduced analogously. Here “the
prohibited part” is a certain set of states $H$. If $l$ and $j$ are prohibited it
would be logical to denote the taboo-probability by ${}_lf^{(n)}(i,j)$.
This is the probability that starting from the initial state $i$ we hit the
state $j$ for the first time at the $n$-th step without entering the state $l$.

We note the two following equations:

$${}_lp^{(n)}(i,j)=\sum_{k=1}^{n}{}_lf^{(k)}(i,j)\,{}_lp^{(n-k)}(j,j), \tag{30}$$

$${}_lp^{(n)}(i,j)=\sum_{k=0}^{n}{}_lp^{(k)}(i,i)\,{}_{l,i}p^{(n-k)}(i,j) \qquad (i\ne j), \tag{31}$$

where ${}_{l,i}p^{(n)}(i,j)$ denotes the taboo-probability with the prohibited set $\{l,i\}$.

Each summand in the r.h.s. of (30) represents the probability of hitting the
$j$-th state at the $n$-th step starting from the initial state $i$ and of hitting the
$j$-th state for the first time at the $k$-th step ($k\le n$) without entering the
state $l$ during the first $n$ steps.

The l.h.s. of (30) represents the summation of these probabilities with
respect to $k$. The summands in the r.h.s. of (31) have the following meaning.
They equal the probability of hitting the $j$-th state at the $n$-th step starting from
the initial state $i$, not entering the state $l$ during this time, but entering
the state $i$ for the last time before the $n$-th step on the $k$-th step ($k\le n$). In
particular it follows from formula (31) that (for $l=j$)

$$f^{(n)}(i,j)=\sum_{k=0}^{n}{}_jp^{(k)}(i,i)\,{}_if^{(n-k)}(i,j). \tag{32}$$
We introduce the following generating functions

$${}_lP_{ij}(z)=\sum_{n=0}^{\infty}{}_lp^{(n)}(i,j)\,z^n, \qquad {}_lF_{ij}(z)=\sum_{n=0}^{\infty}{}_lf^{(n)}(i,j)\,z^n, \qquad {}_lf^{(0)}(i,j)=0.$$

The r.h.s. of equations (30) and (32) are convolutions of two sequences,
therefore

$${}_lP_{ij}(z)={}_lF_{ij}(z)\,{}_lP_{jj}(z), \qquad F_{ij}(z)={}_iF_{ij}(z)\,{}_jP_{ii}(z). \tag{33}$$

Note that the series ${}_lF_{ij}(z)$ are convergent for $z=1$ and, moreover, if the
states $i$ and $j$ communicate then ${}_iF_{ij}(1)>0$. Under this assumption the
second of the equations (33) shows that there exists a finite limit of ${}_jP_{ii}(z)$
as $z\to 1$ and therefore ${}_jP_{ii}(1)<\infty$.

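Define

$${}_lG(i,j)=\sum_{n=0}^{\infty}{}_lp^{(n)}(i,j). \tag{34}$$

Thus if the states $i$ and $j$ communicate, then

$${}_jG(i,i)=\frac{F_{ij}(1)}{{}_iF_{ij}(1)}<\infty. \tag{35}$$

On the other hand, the first of the relations (33) yields ${}_lG(i,j)={}_lF_{ij}(1)\,{}_lG(j,j)$.

Hence

$${}_lG(i,j)\le{}_lG(j,j)<\infty. \tag{36}$$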

Returning to the solution of the system (20) we prove the following 
theorem. 



Theorem 16. Let $l$ be an arbitrary state of an irreducible recurrent Markov
chain. The system (20) admits a non-negative solution

$$x_l=1, \qquad x_i={}_lG(l,i) \quad (i\ne l), \quad i\in I.$$



Proof. Set

$$u_l=1, \qquad u_i={}_lG(l,i) \quad (i\ne l).$$

We have for $i\ne l$

$$\sum_{j\in I}u_j\,p(j,i)=p(l,i)+\sum_{j\ne l}{}_lG(l,j)\,p(j,i)
=p(l,i)+\sum_{n=1}^{\infty}\sum_{j\ne l}{}_lp^{(n)}(l,j)\,p(j,i)
=p(l,i)+\sum_{n=1}^{\infty}{}_lp^{(n+1)}(l,i)={}_lG(l,i)=u_i; \tag{37}$$

if however $i=l$, then, the chain being recurrent,

$$\sum_{j\in I}u_j\,p(j,l)=p(l,l)+\sum_{n=1}^{\infty}f^{(n+1)}(l,l)=\sum_{n=1}^{\infty}f^{(n)}(l,l)=1=u_l,$$

and the theorem is proved. □
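
A small numerical sketch can make the construction of Theorem 16 concrete. The code below is an illustration only and is not part of the original text: the three-state matrix `P`, the function names `taboo_solution` and `apply_chain`, and the truncation parameter `n_terms` are assumptions chosen for the example. It accumulates the taboo-probabilities ${}_lp^{(n)}(l,i)$ step by step, sets $u_l=1$, and then checks the relation $u_i=\sum_j u_jp(j,i)$ verified in the proof.

```python
def taboo_solution(P, l, n_terms=500):
    """Approximate u_l = 1, u_i = sum_{n>=1} _l p^(n)(l, i) for i != l."""
    m = len(P)
    # row[i] holds the taboo probability _l p^(n)(l, i); start with n = 1.
    row = list(P[l])
    u = [0.0] * m
    for _ in range(n_terms):
        for i in range(m):
            if i != l:
                u[i] += row[i]
        # Extend every path by one step; the state just visited must differ from l.
        row = [sum(row[j] * P[j][i] for j in range(m) if j != l) for i in range(m)]
    u[l] = 1.0
    return u


def apply_chain(u, P):
    """Return the row vector u P, i.e. the values sum_j u_j p(j, i)."""
    m = len(P)
    return [sum(u[j] * P[j][i] for j in range(m)) for i in range(m)]


# Hypothetical irreducible chain on three states (each row sums to 1).
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
u = taboo_solution(P, l=0)
print(u)                  # approximately [1.0, 2.0, 1.0]
print(apply_chain(u, P))  # approximately the same vector
```

For a finite chain the series converges geometrically, so a fixed truncation suffices; for a countable state space the same recursion would have to be truncated in space as well, which this sketch does not attempt.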

We now consider the problem of uniqueness of the solution of system
(20) satisfying the conditions $u_l=1$ and $u_j\ge 0$. For this purpose we utilize a
method connected with the introduction of an inverted Markov chain.

First we shall assume that the chain is positive recurrent and let
$\{v_j, j\in I\}$ be the invariant initial distribution.

Consider a stationary Markov process $\xi(t)$ corresponding to the initial
distribution $\{v_j, j\in I\}$. Denote by $\mathsf{P}$ the probability measure
corresponding to this process. We introduce the following conditional
probabilities

$$\mathsf{P}\{\xi(t-1)=j_1,\ \xi(t-2)=j_2,\ \ldots,\ \xi(t-n)=j_n\mid\xi(t)=i\},$$

where $t>n$; we have

$$\mathsf{P}\{\xi(t-1)=j_1,\ \ldots,\ \xi(t-n)=j_n\mid\xi(t)=i\}=\frac{v_{j_n}\,p(j_n,j_{n-1})\,p(j_{n-1},j_{n-2})\cdots p(j_1,i)}{v_i}=q(i,j_1)\,q(j_1,j_2)\cdots q(j_{n-1},j_n),$$

where

$$q(i,j)=p(j,i)\,\frac{v_j}{v_i}.$$

Thus in a stationary positive-recurrent Markov chain the conditional
transition probabilities obtained from the change in time direction (from
present to past) correspond to a certain Markov chain. Moreover it
should be noted that all $v_i>0$ and therefore

$$q(i,j)\ge 0, \qquad \sum_{j\in I}q(i,j)=\frac{1}{v_i}\sum_{j\in I}v_j\,p(j,i)=\frac{v_i}{v_i}=1.$$



The above construction is applicable not only for a positive but also for
an arbitrary recurrent chain (i.e. for a null-recurrent chain as well). For
this purpose we consider an arbitrary positive solution $\{x_j, j\in I\}$ of
system (20) (it will be shown below that such a solution exists) and set

$$q(i,j)=p(j,i)\,\frac{x_j}{x_i}. \tag{38}$$

As in the case above,

$$\sum_{j\in I}q(i,j)=1.$$

A Markov chain with transition probabilities (38) is called the inversion
of the initial chain (the inverse chain).

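We note the following formulas for $n$-step transition probabilities
in an inverted chain. We have

$$q^{(n)}(i,j)=\sum_{j_1,j_2,\ldots,j_{n-1}}q(i,j_1)\,q(j_1,j_2)\cdots q(j_{n-1},j)
=\sum_{j_1,j_2,\ldots,j_{n-1}}p(j_1,i)\,p(j_2,j_1)\cdots p(j,j_{n-1})\,\frac{x_j}{x_i},$$

i.e.

$$q^{(n)}(i,j)=\frac{x_j}{x_i}\,p^{(n)}(j,i). \tag{39}$$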

The following corollary follows from the above : 

If the initial chain is irreducible, recurrent, positive or null, then the 
inverted chain possesses the same properties. 
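
The inversion (38) and relation (39) can be checked numerically on a small example. The sketch below is illustrative and not taken from the text: the matrix `P`, the vector `x` (a positive solution of $x_i=\sum_jx_jp(j,i)$ for this particular `P`) and the helper names are assumptions made for the example.

```python
def inverted_chain(P, x):
    """Transition matrix q(i, j) = p(j, i) * x_j / x_i of the inverted chain."""
    m = len(P)
    return [[P[j][i] * x[j] / x[i] for j in range(m)] for i in range(m)]


def mat_mul(A, B):
    """Plain matrix product, used here to form two-step transition probabilities."""
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


# Hypothetical irreducible, non-reversible chain and a positive solution of x = x P.
P = [[0.0, 1.0, 0.0],
     [0.0, 0.5, 0.5],
     [1.0, 0.0, 0.0]]
x = [1.0, 2.0, 1.0]

Q = inverted_chain(P, x)
print(Q)  # [[0.0, 0.0, 1.0], [0.5, 0.5, 0.0], [0.0, 1.0, 0.0]]

# Two-step check of (39): q^(2)(i, j) = (x_j / x_i) * p^(2)(j, i).
P2, Q2 = mat_mul(P, P), mat_mul(Q, Q)
print(all(abs(Q2[i][j] - x[j] / x[i] * P2[j][i]) < 1e-12
          for i in range(3) for j in range(3)))
```

Each row of `Q` sums to one precisely because `x` solves the invariance equations; with any other positive vector the rows would in general fail to sum to one.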

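From the limit theorem for ratios we obtain in the case of a recurrent
chain

$$\lim_{N\to\infty}\frac{\sum_{n=0}^{N}q^{(n)}(i,j)}{\sum_{n=0}^{N}q^{(n)}(j,j)}=1.$$

Using formulas (39) we have

$$\lim_{N\to\infty}\frac{\sum_{n=0}^{N}p^{(n)}(j,i)}{\sum_{n=0}^{N}p^{(n)}(j,j)}=\frac{x_i}{x_j}. \tag{40}$$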

The following theorem is a corollary of the relationship obtained:

Theorem 17. For an irreducible recurrent chain a non-negative solution
of system (20) such that $x_l=1$ is unique. Moreover $x_i={}_lG(l,i)$ and

$$\lim_{N\to\infty}\frac{\sum_{n=0}^{N}p^{(n)}(l,i)}{\sum_{n=0}^{N}p^{(n)}(l,l)}={}_lG(l,i). \tag{41}$$

Formula (41) follows from the uniqueness of the solution of system
(20) and Theorem 16, while the uniqueness follows from formula (40)
and the assumption that $x_j>0$ for all $j$. Therefore in view of Theorem 16,
it is sufficient to show that if $\{x_j, j\in I\}$ is a non-negative non-trivial
solution of system (20), then $x_j>0$. This can be obtained as follows.
We have for a non-negative solution of system (20)

$$x_i=\sum_jx_j\,p(j,i)=\sum_j\sum_kx_k\,p(k,j)\,p(j,i)=\sum_kx_k\sum_jp(k,j)\,p(j,i)=\sum_kx_k\,p^{(2)}(k,i).$$

Using induction one can easily obtain that $x_i=\sum_{k\in I}x_k\,p^{(n)}(k,i)$. Let $x_l>0$;
for any $i$ one can find $n$ such that $p^{(n)}(l,i)>0$ and therefore
$x_i\ge x_l\,p^{(n)}(l,i)>0$. Constructing an inverted chain for the given solution
of system (20) and putting $x_l=1$ we obtain from (40) the uniqueness of
$x_i$, $i\in I$. In view of Theorem 16, $x_i={}_lG(l,i)$.

Remark. Formula (40) is a generalization of the relationship
$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}p^{(n)}(j,i)=v_i$ (valid for a positive recurrent chain), where $\{v_i\}$ is the
invariant initial distribution.

The following theorem is a refinement of Theorem 17. 

Theorem 18. For an irreducible recurrent Markov chain the system
of inequalities

$$x_i\ge\sum_{j\in I}x_j\,p(j,i), \qquad x_i\ge 0, \qquad x_l=1, \tag{42}$$

admits a unique solution and moreover $x_i=\sum_{j\in I}x_j\,p(j,i)$.

In view of Theorem 16 it is sufficient to prove the uniqueness of the
solution for system (42). We introduce the inverted Markov chain with
transition probabilities $q(i,j)=p(j,i)\dfrac{u_j}{u_i}$, where $u_j$ is the positive solution
of system (20). This chain is irreducible and recurrent. We have

$$\sum_jq(i,j)\,\frac{x_j}{u_j}=\frac{1}{u_i}\sum_jx_j\,p(j,i)\le\frac{x_i}{u_i}.$$

But in view of Theorem 14 the system of inequalities

$$y_i\ge\sum_jq(i,j)\,y_j$$

admits the unique non-negative solution $y_i=1$ (normalized by $y_l=1$). Hence, $x_i=u_i$ for all $i\in I$.
The theorem is thus proved. □



§6. Random Walks on a Lattice 

Irreducibility. Definition 1. A set $Z$ of vectors $z=\sum_{i=1}^{s}a_ie_i$, where $e_i$ $(i=1,\ldots,s)$
are linearly independent vectors in $\mathscr{R}^m$ and $a_i$ are integers
$(a_i=0,\pm1,\pm2,\ldots)$, is called a lattice.

Clearly, $Z$ is the minimal additive group containing the vectors
$e_1,e_2,\ldots,e_s$. The number $s$ is the dimension of the lattice and the vectors
$e_1,e_2,\ldots,e_s$ are its basis. If $s<m$ the lattice is called degenerate; for
$s=m$ the lattice is non-degenerate.

A random walk $\{\zeta(n), n=0,1,2,\ldots\}$ on the lattice $Z$ is defined by the
formula $\zeta(n)=x+\xi_1+\ldots+\xi_n$, $n\ge 1$, $\zeta(0)=x$, where $x$ is a non-random
vector being the initial position of the random walk, $x\in Z$, and $\xi_1,\xi_2,\ldots,\xi_n,\ldots$
are identically distributed independent random vectors with
values in $Z$. Put $p(x)=\mathsf{P}\{\xi_1=x\}$, $x\in Z$. If $x_k\in Z$ $(k=0,1,\ldots,n)$, we obtain
from the definition of random walks

$$\mathsf{P}\{\zeta(0)=x_0,\ \zeta(1)=x_1,\ \ldots,\ \zeta(n)=x_n\}=\delta(x_0-x)\prod_{k=1}^{n}p(x_k-x_{k-1}).$$

Therefore a random walk on a lattice is a particular case of a homogeneous
Markov chain with a countable number of states and with one-step
transition probabilities satisfying $p(x,y)=p(y-x)$. The basic characteristic
feature of random walks which distinguishes them from general
Markov chains with a countable number of states is the spatial
homogeneity of transition probabilities:

$$p(x+z,y+z)=p(x,y)=p(y-x).$$

This property is just another expression of the independence of the displacement
vector of the walk $\xi_{n+1}=\zeta(n+1)-\zeta(n)$ from its position at the
given moment. Evidently the spatial homogeneity holds also for the
$n$-step transition probabilities:

$$p^{(n)}(x+z,y+z)=\mathsf{P}\{\zeta(n)=y+z\mid\zeta(0)=x+z\}=\mathsf{P}\{\zeta(n)-\zeta(0)=y-x\mid\zeta(0)=x\}=p^{(n)}(y-x),$$

where $p^{(n)}(x)=\mathsf{P}\{\xi_1+\xi_2+\ldots+\xi_n=x\}$ is the probability that the sum of
$n$ independent identically distributed random vectors takes on the value $x$.
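
Since $p^{(n)}(x)$ is the distribution of a sum of $n$ independent steps, it can be computed by repeated convolution. The following sketch is an illustration under stated assumptions: the step law `step` (a simple symmetric walk on the two-dimensional integer lattice) and the function names are chosen for the example and do not come from the text. The helper `transition` simply encodes the identity $p^{(n)}(x,y)=p^{(n)}(y-x)$.

```python
from collections import defaultdict


def n_step_distribution(p, n):
    """Distribution of xi_1 + ... + xi_n for i.i.d. steps with law p (dict: point -> prob)."""
    origin = tuple(0 for _ in next(iter(p)))
    dist = {origin: 1.0}  # unit mass at the origin (n = 0)
    for _ in range(n):
        new = defaultdict(float)
        for x, px in dist.items():
            for s, ps in p.items():
                y = tuple(a + b for a, b in zip(x, s))
                new[y] += px * ps
        dist = dict(new)
    return dist


def transition(p, n, x, y):
    """n-step transition probability p^(n)(x, y) = p^(n)(y - x)."""
    diff = tuple(b - a for a, b in zip(x, y))
    return n_step_distribution(p, n).get(diff, 0.0)


# Hypothetical step law: simple symmetric walk on the two-dimensional integer lattice.
step = {(1, 0): 0.25, (-1, 0): 0.25, (0, 1): 0.25, (0, -1): 0.25}
print(transition(step, 2, (0, 0), (0, 0)))    # 0.25
print(transition(step, 2, (3, -1), (3, -1)))  # 0.25 as well, by spatial homogeneity
```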

It follows from the spatial homogeneity of the walk that the set $K_x$
of all points of $Z$ accessible from a given point $x$ can be represented in the
form $K_0+x$, where $K_0$ is the set of points accessible from 0 $(0\in K_0)$. To
describe the set $K_0$ we introduce the set $D$ consisting of all those $x\in Z$ for which
$p(x)>0$. The set $D$ is called the support of the distribution of the random
vectors $\xi_k$. Only points belonging to $D$ are accessible in one step from 0.
Only those points of $Z$ which admit the representation $x=x_1+x_2$, where
$x_i\in D$, are accessible in two steps from 0. Let $H_+$ denote the collection
of all points in $Z$ of the form $x=n_1x_1+\ldots+n_sx_s$, where $s$ and $n_k$
$(k=1,\ldots,s)$ are arbitrary positive integers and $x_k\in D$. Clearly,

$$K_0=H_+,$$

i.e. $H_+$ is the set of all points accessible from 0.

Two points $x$ and $y$ belonging to $Z$ are termed communicating if
$x-y\in H_+$ and $y-x\in H_+$. Set $H_0=H_+\cap(-H_+)$.

The decomposition into classes of communicating states described in
Section 5 consists, in the present case, of the following: $H_0$ is the class
of states containing the point zero, and all the other classes of communicating
states are of the form $H_k=x_k+H_0$, where $x_k$ is an arbitrary sequence
belonging to $Z$ such that $x_k-x_j\notin H_0$ $(k\ne j)$.

It follows from the spatial homogeneity of random walks that different
classes of communicating states are either all essential or all non-essential,
so that the property of being essential or non-essential is related
to the random walk as a whole.

The condition of being essential is equivalent to the requirement that
$H_+=H_0$, which means that $H_+$ is a group.

Therefore in order that the states of a random walk be essential it is
necessary and sufficient that the subset $H_+$ of points of $Z$ be a group.

We introduce the set $H$ of points $z$ which can be represented as
$z=x-y$, where $x,y\in H_+$. This set is the minimal group containing the points
$z\in Z$ accessible from zero. It will be shown that $H$ is a lattice in $\mathscr{R}^m$ (possibly
of a lower dimension). It thus follows that when studying random
walks one may assume that $H$ coincides with the lattice $Z$ of all vectors
with integer-valued coordinates in the space $\mathscr{R}^m$. Denote by $Z^m$ the lattice
of all integer-valued vectors in the space $\mathscr{R}^m$.

The following theorem is purely algebraic in nature and can be formulated as

Theorem 1. The $r$-dimensional additive group $H$ of vectors with integer-valued
coordinates in a linear space $\mathscr{R}^m$ is an $r$-dimensional lattice.
Proof. Let $r$ be the maximal number of linearly independent vectors in
$H$. We shall prove that there exist $r$ linearly independent vectors $x_k\in H$
such that $H$ coincides with the set of vectors of the form $a_1x_1+a_2x_2+\ldots+a_rx_r$,
where $a_k$ are arbitrary integers ($a_k=0,\pm1,\pm2,\ldots$, $k=1,2,\ldots,r$).
Let $x_1^*,x_2^*,\ldots,x_r^*$ be an arbitrary maximal system of linearly independent
vectors in $H$. Then each vector $x\in H$ can be represented in the form

$$x=\sum_{k=1}^{r}b_kx_k^*, \tag{1}$$

where $b_k$ are real numbers. On the other hand, $x_k^*=\sum_{j=1}^{m}c_{kj}e_j$, where $c_{kj}$
are integers and the rank of the matrix $\{c_{kj}\}$ equals $r$. Representation (1)
is equivalent to the system of linear equations $\sum_{k=1}^{r}b_kc_{kj}=a_j$, $j=1,2,\ldots,m$,
where $a_j$ are the integer-valued coordinates of $x$ in the basis $\{e_k, k=1,\ldots,m\}$.
It thus follows that the numbers $b_k$ are rational and that there may exist only a
finite number of vectors of form (1) with $0\le b_k<1$.
Therefore if $B$ is the least common denominator of all the $b_k$ appearing in these
vectors, then representation (1) can be written in the form

$$x=\sum_{k=1}^{r}c_ky_k, \qquad y_k=\frac{x_k^*}{B},$$

where $c_k$ are integers. Consider now an arbitrary linear transformation
$z_k=\sum_{j=1}^{r}n_{kj}y_j$ $(k=1,\ldots,r)$ with integer-valued coefficients and the determinant

$$V(z_1,z_2,\ldots,z_r)=\begin{vmatrix}n_{11}&n_{12}&\cdots&n_{1r}\\ n_{21}&n_{22}&\cdots&n_{2r}\\ \vdots&\vdots&&\vdots\\ n_{r1}&n_{r2}&\cdots&n_{rr}\end{vmatrix} \tag{2}$$

consisting of the coordinates of the vectors $z_1,\ldots,z_r$ in the basis $\{y_j, j=1,\ldots,r\}$.
The determinant is integer-valued, and it is different from zero if and only
if the system of vectors $z_1,\ldots,z_r$ is linearly independent. We choose
a system of vectors $z_1,\ldots,z_r$ so that $z_k\in H$ and the determinant (2) is
of the smallest positive value. Such a system exists. We denote the
corresponding vectors by $l_1,\ldots,l_r$. Suppose that for some $x\in H$ in the
expansion $x=\sum_{k=1}^{r}d_kl_k$ not all $d_k$ are integers; then there exists a vector
$l'\in H$ such that $l'=\sum_{k=1}^{r}d_k'l_k$, $0\le d_k'<1$, $d_j'>0$ for some $j$. Moreover

$$0<V(l_1,\ldots,l_{j-1},l',l_{j+1},\ldots,l_r)=d_j'\,V(l_1,\ldots,l_r)<V(l_1,\ldots,l_r),$$

which contradicts the fact that the determinant $V(l_1,\ldots,l_r)$ attains the
minimal value.

Therefore the lattice for which the system of vectors $\{l_1,\ldots,l_r\}$ is
the basis coincides with $H$. The theorem is thus proved. □

Definition 2. A random walk on an integer-valued lattice $Z^m$ is called
irreducible if $H=Z^m$ and is called reducible if $H\ne Z^m$.

Note that the notion of irreducibility of a random walk just introduced
bears no relation to the definition of irreducibility of a Markov
chain.

The preceding theorem shows that by means of an affine transformation
of the space one can always ensure that the $m$-dimensional random
walk will be $m$-dimensional irreducible. The following criterion of
irreducibility of a random walk can be given in terms of characteristic
functions. Let

$$J(u)=\mathsf{E}e^{i(u,\xi_1)}=\sum_{x\in Z^m}e^{i(u,x)}p(x) \tag{3}$$

be the characteristic function of the vector $\xi_1=\zeta(1)-\zeta(0)$ representing
one step in the random walk.

Theorem 2. In order that a random walk be irreducible it is necessary and
sufficient that $J(u)\ne 1$ for $u\ne 2\pi x$, $x\in Z^m$.

Proof. Sufficiency. Let the walk be reducible. If the dimension of $H$ is
less than $m$, there exists a vector $e$ orthogonal to $H$, such that $(ce,\xi_1)=0$
with probability 1 for any $c$, and the condition of the theorem is not
satisfied. Assume now that the dimension of $H$ is $m$. Choose in $H$ the
basis $l_1,\ldots,l_m$ and let $T$ be a linear transformation changing the basis
$\{e_k, k=1,\ldots,m\}$ to $\{l_k, k=1,\ldots,m\}$, $l_k=Te_k$. The matrix of the
transformation $T$ in the basis $\{e_k, k=1,\ldots,m\}$ consists of the coordinates of
the vectors $l_k$ and has therefore integer-valued entries. Its determinant
however does not equal $\pm1$. Indeed, if it were $\pm1$, the inverse matrix
also would have had integer-valued entries and each point in $Z^m$
would have been a point in $H$, which contradicts the fact that the walk
is reducible. Consider the set $Z'$ of all vectors $v$ such that $T^*v\in Z^m$,
where $T^*$ is the conjugate of $T$. Clearly $Z'$ is an additive group and
$Z^m\subset Z'$. On the other hand, $Z'\ne Z^m$, otherwise the integer-valued
transformation $T^*$ would have had an integer-valued inverse, which
contradicts the relations

$$1=\operatorname{Det}(T^*T^{*-1})=\operatorname{Det}(T)\operatorname{Det}(T^{*-1}),$$

since $\operatorname{Det}(T)\ne\pm1$. Therefore there exists a vector $v$ such that $v\in Z'$, $v\notin Z^m$
and $T^*v\in Z^m$. Hence the number $(v,l_k)=(v,Te_k)=(T^*v,e_k)$ is an
integer for any $k$ and therefore $(v,\xi_1)$ is an integer with probability 1, so
that $J(2\pi v)=\mathsf{E}\exp\{2\pi i(v,\xi_1)\}=1$ for $v\notin Z^m$.

We have thus proved the sufficiency part of the theorem.

Necessity. Let the walk be irreducible and let $J(2\pi v)=1$. Then
$\mathsf{E}[1-\exp\{2\pi i(v,\xi_1)\}]=0$, which is possible only if $(v,\xi_1)$ is an integer
with probability 1. From the irreducibility of the walk it follows that
$(v,e_k)$ is an integer $(k=1,\ldots,m)$, i.e. $v\in Z^m$. This completes the proof. □
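
The criterion of Theorem 2 is easy to test numerically on one-dimensional examples. The sketch below is an illustration only: the two step laws and the function name `char_fn` are assumptions made for the example. For steps $\pm1$ the group $H$ is all of $Z$, whereas for steps $\pm2$ it is the proper subgroup $2Z$, and the characteristic function of the latter equals 1 at $u=\pi$, a point not of the form $2\pi x$ with integer $x$.

```python
import cmath
import math


def char_fn(p, u):
    """J(u) = sum_x exp(i (u, x)) p(x) for a step law p on the integer lattice."""
    return sum(prob * cmath.exp(1j * sum(a * b for a, b in zip(u, x)))
               for x, prob in p.items())


# Hypothetical one-dimensional step laws.
irreducible = {(1,): 0.5, (-1,): 0.5}   # H = Z
reducible = {(2,): 0.5, (-2,): 0.5}     # H = 2Z, a proper subgroup of Z

u = (math.pi,)                           # not of the form 2*pi*x with integer x
print(abs(char_fn(irreducible, u) - 1))  # close to 2, since J(pi) = -1
print(abs(char_fn(reducible, u) - 1))    # close to 0, since J(pi) = 1
```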

Recurrent walks. Let $f^{(s)}(x,y)$ be the probability that a random walk
with the initial state $x$ passes the state $y$ for the first time at the instant
$s$, and let $F(x,y)=\sum_{s=1}^{\infty}f^{(s)}(x,y)$.

From relation (4) of Section 5 we have

$$f^{(n)}(x,y)=p^{(n)}(x,y)-\sum_{s=1}^{n-1}f^{(s)}(x,y)\,p^{(n-s)}(y,y)$$

and it follows from the spatial homogeneity of the random walk that

$$f^{(n)}(x,y)=f^{(n)}(0,y-x)=f^{(n)}(y-x);$$

also the previous equation can be rewritten as

$$f^{(n)}(x)=p^{(n)}(x)-\sum_{s=1}^{n-1}f^{(s)}(x)\,p^{(n-s)}(0).$$

Moreover the function $F(x,y)$ depends only on the difference $y-x$ and
we may put $F(y,y+x)=F(x)$. In particular, $F(x,x)=F(0,0)$, so that the
states of the random walk are either all recurrent or all non-recurrent.
Therefore we shall henceforth refer to recurrent or non-recurrent random
walks.

Let

$$F_x(z)=\sum_{n=1}^{\infty}f^{(n)}(x)\,z^n, \qquad P_x(z)=\sum_{n=0}^{\infty}p^{(n)}(x)\,z^n.$$

The functions $F_x(z)$ and $P_x(z)$ are related as follows (cf. Section 5, (6)):

$$P_0(z)=(1-F_0(z))^{-1}, \qquad P_x(z)=P_0(z)\,F_x(z) \quad (x\ne 0).$$
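
The recursion relating first-passage and $n$-step probabilities can be evaluated directly. The sketch below is illustrative: it assumes the simple symmetric walk on $Z$ (steps $\pm1$ with probability $1/2$ each), for which $p^{(2n)}(0)=\binom{2n}{n}4^{-n}$; the function names are hypothetical. Exact rational arithmetic is used so that the first-return probabilities come out as fractions.

```python
from fractions import Fraction
from math import comb


def p_return(n):
    """p^(n)(0) for the simple symmetric walk on Z (an assumed example)."""
    if n % 2:
        return Fraction(0)
    return Fraction(comb(n, n // 2), 2 ** n)


def first_return_probs(n_max):
    """f^(n)(0) from f^(n)(0) = p^(n)(0) - sum_{s=1}^{n-1} f^(s)(0) p^(n-s)(0)."""
    f = [Fraction(0)] * (n_max + 1)
    for n in range(1, n_max + 1):
        f[n] = p_return(n) - sum(f[s] * p_return(n - s) for s in range(1, n))
    return f


f = first_return_probs(6)
print(f[2], f[4], f[6])  # 1/2 1/8 1/16
```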



We thus obtain the following assertion:
for a random walk to be recurrent it is necessary and sufficient that

$$G(0)=\sum_{n=0}^{\infty}p^{(n)}(0)=\infty.$$

Recall that in view of the results presented in Section 5 if a walk is
recurrent then the return to the initial state occurs with probability 1
infinitely often during an infinite interval of time. Relation (10) of Section 5
becomes

$$F(x)=\lim_{N\to\infty}\frac{\sum_{n=1}^{N}p^{(n)}(x)}{\sum_{n=0}^{N}p^{(n)}(0)}. \tag{4}$$

Therefore if state $x$ is accessible from zero and the walk is recurrent, then
$G(x)=\infty$, where

$$G(x)=\sum_{n=0}^{\infty}p^{(n)}(x).$$

If, however, the walk is not recurrent, then

$$G(x)\le G(0)<\infty.$$

The function $G(x)$ has the following probabilistic meaning. It equals the
mean value (mathematical expectation) of the number of “visits” by a
random walk which started at point 0 to state $x$ during the interval $(0,\infty)$.
In the case of a recurrent walk $G(x)$ is either 0 or $\infty$. In the non-recurrent
case $G(x)$ is called the Green function of a random walk.

The following criterion for non-recurrency of a random walk is a
simple corollary of the strong law of large numbers.

Let the step of a random walk possess a finite mathematical expectation
different from zero. Then the walk is non-recurrent.

Indeed, with probability 1

$$\lim_{n\to\infty}\frac{\zeta(n)}{n}=m\ne 0.$$

Therefore, for almost all $\omega$, $n_0=n_0(\omega)$ can be found such that
$|\zeta(n)|>\dfrac{|m|}{2}\,n$ for $n\ge n_0$, so that the return to the point 0 is impossible starting
from the moment $n_0$.

One can obtain a number of other criteria of recurrency and non-recurrency
by using the characteristic function $J(u)$ of the step in a random
walk. It is easy to see that

$$G(0)=\lim_{t\uparrow 1}\frac{1}{(2\pi)^m}\int_C\operatorname{Re}\,(1-tJ(u))^{-1}\,du, \tag{5}$$

where $m$ is the dimension of the lattice, $C$ is the cube in $\mathscr{R}^m$, $C=\{u: |u^j|<\pi,\ j=1,\ldots,m\}$,
and $0<t<1$. Indeed, first of all

$$G(0)=\lim_{t\uparrow 1}\sum_{n=0}^{\infty}p^{(n)}(0)\,t^n.$$

On the other hand, expression (3) for the characteristic function of a
random walk shows that $p(x)$ are the Fourier coefficients of the Fourier-series
expansion of $J(u)$. Therefore

$$p(x)=\frac{1}{(2\pi)^m}\int_Ce^{-i(u,x)}J(u)\,du$$

and

$$p^{(n)}(x)=\frac{1}{(2\pi)^m}\int_Ce^{-i(u,x)}J^n(u)\,du, \tag{6}$$

so that for $0<t<1$

$$P_0(t)=\sum_{n=0}^{\infty}p^{(n)}(0)\,t^n=\frac{1}{(2\pi)^m}\int_C(1-tJ(u))^{-1}\,du.$$

Since $P_0(t)$ is real, one can replace the integrand in the last integral by
its real part. Passing to the limit as $t\to 1$ we obtain formula (5). Putting
$J_c(u)=\operatorname{Re}J(u)$ we can rewrite formula (5) as

$$G(0)=\lim_{t\uparrow 1}\frac{1}{(2\pi)^m}\int_C\frac{1-tJ_c(u)}{|1-tJ(u)|^2}\,du.$$

Utilizing this formula one may obtain a number of special criteria of
recurrency. For example we shall now prove that a one-dimensional
random walk for which $m=\mathsf{E}\xi_1=0$ is recurrent.

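Indeed

$$\frac{J(u)-1}{iu}\to m=0 \quad\text{as } u\to 0.$$

Therefore for any $\varepsilon>0$ a $\delta>0$ can be found such that $|1-J(u)|<\varepsilon|u|$
for $|u|<\delta$. Since $1-tJ_c(u)\ge 1-t$ and $|1-tJ(u)|^2\le 2[(1-t)^2+\varepsilon^2u^2]$ for $|u|<\delta$,
we obtain

$$G(0)\ge\lim_{t\uparrow 1}\frac{1}{2\pi}\int_{-\delta}^{\delta}\frac{1-t}{2[(1-t)^2+\varepsilon^2u^2]}\,du=\lim_{t\uparrow 1}\frac{1}{2\pi\varepsilon}\operatorname{arctg}\frac{\varepsilon\delta}{1-t}=\frac{1}{4\varepsilon};$$

since $\varepsilon$ is arbitrary, $G(0)=\infty$ as claimed.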

To obtain analogous results for multidimensional walks certain 
bounds on characteristic functions are required. 

Lemma 1. For an irreducible random walk of dimension $m\ge 2$ there exists
a constant $k$ such that $1-J_c(u)\ge k|u|^2$ for $u\in C$, where $C$ is the cube,

$$C=\{u: \max_i|u^i|\le\pi\}.$$

Proof. Since

$$1-J_c(u)=\sum_{x\in Z^m}[1-\cos(u,x)]\,p(x)$$

and

$$1-\cos(u,x)=2\sin^2\frac{(u,x)}{2}\ge 2\left(\frac{2}{\pi}\cdot\frac{(u,x)}{2}\right)^2=\frac{2}{\pi^2}(u,x)^2$$

for $|(u,x)|\le\pi$, we have

$$1-J_c(u)\ge\frac{2}{\pi^2}{\sum}'(u,x)^2p(x),$$

where ${\sum}'$ denotes the summation over all $x\in Z^m$ satisfying the condition
$|(u,x)|\le\pi$. Since the walk is irreducible one can choose in the set $\{x: p(x)>0\}$
a basis of $\mathscr{R}^m$. Let $e_1,\ldots,e_m$ be the vectors of this basis and let
$N=\max\{|e_k|,\ k=1,2,\ldots,m\}$. Moreover let $|u|\le\pi N^{-1}$; then

$$1-J_c(u)\ge\frac{2}{\pi^2}\sum_{k=1}^{m}(u,e_k)^2p(e_k).$$

The quadratic form appearing in the r.h.s. of the inequality is positive
definite. Therefore there exists a constant $k_1$ such that $\sum_{k=1}^{m}(u,e_k)^2p(e_k)\ge k_1|u|^2$. Thus

$$1-J_c(u)\ge\frac{2}{\pi^2}k_1|u|^2 \quad\text{for } |u|\le\pi N^{-1}.$$

In view of Theorem 2, $J(u)$ is different from one in the
region $C_1=C\setminus\{u: |u|<\pi N^{-1}\}$, and therefore $\min_{u\in C_1}[1-J_c(u)]=k_2>0$. But
then, since $|u|^2\le m\pi^2$ for $u\in C$, we have $1-J_c(u)\ge k_2\ge k_2m^{-1}\pi^{-2}|u|^2$ for $|u|\ge\pi N^{-1}$, $u\in C$,
which proves the assertion.

We now return to the problem of recurrent random walks. Assume
that the two-dimensional random walk is irreducible, $\mathsf{E}\xi_1=0$ and
$\mathsf{E}|\xi_1|^2<\infty$. In view of Fatou's lemma

$$G(0)=\lim_{t\uparrow 1}\frac{1}{(2\pi)^2}\int_C\frac{1-tJ_c(u)}{|1-tJ(u)|^2}\,du\ge\frac{1}{(2\pi)^2}\int_C\frac{1-J_c(u)}{|1-J(u)|^2}\,du. \tag{7}$$

It follows from Lemma 1 that $1-J_c(u)\ge k|u|^2$, $u\in C$. On the other hand,
since $\xi_1$ possesses finite second moments and $\mathsf{E}\xi_1=0$,

$$J(u)=1-\tfrac12\mathsf{E}(u,\xi_1)^2+o(|u|^2).$$

Hence $|1-J(u)|\le k_1|u|^2$ in some neighbourhood of the point $u=0$.
Therefore the integrand in the r.h.s. of inequality (7) is not less than
$\dfrac{k|u|^2}{k_1^2|u|^4}=\dfrac{k}{k_1^2}\cdot\dfrac{1}{|u|^2}$ in some neighbourhood of the point $u=0$ and the corresponding
integral diverges. Therefore the random walk under consideration is
recurrent. As far as the random walk of dimension $\ge 3$ is concerned it is
always non-recurrent.

Indeed, since

$$I=\lim_{t\uparrow 1}\int_C\frac{1-tJ_c(u)}{|1-tJ(u)|^2}\,du\le\lim_{t\uparrow 1}\int_C\frac{du}{1-tJ_c(u)}\le\int_C\frac{du}{1-J_c(u)},$$

using Lemma 1, we have

$$I\le\int_C\frac{du}{k|u|^2},$$

and the last integral converges if the dimension of the space is $m\ge 3$.
The results obtained can be summarized as follows:



Theorem 3. A random walk of dimension $m\ge 3$ is always non-recurrent. It
is also non-recurrent if $\mathsf{E}\xi_1$ exists and $\mathsf{E}\xi_1\ne 0$. If $\mathsf{E}\xi_1=0$ then the
walk is recurrent in the one-dimensional case. If in addition to $\mathsf{E}\xi_1=0$ the
condition $\mathsf{E}|\xi_1|^2<\infty$ is also satisfied then the random walk is recurrent in
the two-dimensional case.
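
The dichotomy in Theorem 3 can be observed numerically through the partial sums of the series $G(0)=\sum_np^{(n)}(0)$. The sketch below is an illustration under stated assumptions: it uses the simple symmetric walk on the $d$-dimensional integer lattice (steps $\pm$ unit vectors with equal probabilities), for which closed combinatorial expressions for $p^{(2n)}(0)$ are classical; the function names are hypothetical and only $d=1,2,3$ are covered.

```python
from math import comb, factorial


def return_prob(d, n):
    """p^(2n)(0) for the simple symmetric walk on the d-dimensional lattice, d in {1, 2, 3}."""
    if d == 1:
        return comb(2 * n, n) / 4 ** n
    if d == 2:
        return (comb(2 * n, n) / 4 ** n) ** 2
    if d == 3:
        # Sum over the numbers j, k, n - j - k of steps in the positive x, y, z directions.
        total = sum(
            factorial(2 * n) // (factorial(j) * factorial(k) * factorial(n - j - k)) ** 2
            for j in range(n + 1) for k in range(n + 1 - j))
        return total / 6 ** (2 * n)
    raise ValueError("only d = 1, 2, 3 are implemented in this sketch")


def green_partial_sum(d, n_max):
    """Partial sum sum_{n=0}^{n_max} p^(2n)(0) of the series defining G(0)."""
    return sum(return_prob(d, n) for n in range(n_max + 1))


for d in (1, 2, 3):
    print(d, [round(green_partial_sum(d, N), 3) for N in (10, 20, 40)])
```

In dimensions one and two the partial sums keep growing (roughly like $\sqrt{N}$ and $\ln N$ respectively), while in dimension three they level off, in agreement with the theorem.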




§7. Local Limit Theorems for Lattice Walks 

The asymptotic behavior as $n\to\infty$ of the probability $p^{(n)}(x)$ of hitting the
lattice point $x$ during $n$ steps of a random walk is studied in the present
section. Analytically the problem consists of investigating the asymptotic
behavior of the integral (cf. Section 6 (6))

$$p^{(n)}(x)=\frac{1}{(2\pi)^m}\int_Ce^{-i(u,x)}J^n(u)\,du \tag{1}$$

as $n\to\infty$. We shall consider only irreducible walks. Moreover it is assumed
that the walk possesses the property of complete irreducibility as defined
below.

Definition 1. An irreducible walk is called completely irreducible if for
any point $x_0\in D$ the random walk with the step (of the walk) $\xi_1-x_0$
is also irreducible.

Utilizing Theorem 2 of Section 6 it is easy to formulate a criterion of
complete irreducibility of a random walk.

Theorem 1. In order that a walk be completely irreducible it is necessary
and sufficient that

$$|J(u)|\ne 1 \quad\text{if } u\ne 2\pi x,\ x\in Z^m.$$

Proof. Let the walk be completely irreducible and let $J(u)=e^{it}$, where $t$
is a real number. It follows from the equalities

$$\sum_{x\in Z^m}e^{i(x,u)}p(x)=e^{it}, \qquad \sum_{x\in Z^m}p(x)=1$$

that $(x,u)=t+2\pi n$, $n=n(x)$, for each $x$ such that $p(x)>0$ $(x\in D)$. Let
$x_0\in D$; then

$$1=\sum_{x\in Z^m}e^{i(x-x_0,u)}p(x)=\sum_{x\in Z^m}e^{i(x,u)}p(x+x_0)=\sum_{x\in Z^m}e^{i(x,u)}q(x),$$

where $q(x)$ is the distribution of the random vector $\eta_1=\xi_1-x_0$. It follows
from the irreducibility of the walk with the step $\eta_1$ and Theorem 2 of
Section 6 that the last relationship can hold only if $u=2\pi x$, $x\in Z^m$. On
the other hand let $J(u_0)=e^{it}$ for $u_0\ne 2\pi x$ $(x\in Z^m)$. It follows from the
above that the walk with the distribution of the first step $q(x)$ will be
reducible. The theorem is thus proved. □

We now proceed to bound integral (1). Firstly, transforming it using
the transformation

$$x=na+\sqrt{n}\,x_n, \qquad a=\mathsf{E}\xi_1, \quad x\in Z^m,$$

we get

$$n^{m/2}p^{(n)}(x)=\frac{1}{(2\pi)^m}\int_{\sqrt{n}C}e^{-i(u,x_n)}\left[e^{-i(u,a)/\sqrt{n}}J\!\left(\frac{u}{\sqrt{n}}\right)\right]^n du, \tag{2}$$

where $\sqrt{n}C$ denotes the cube $\{u: |u^j|\le\sqrt{n}\,\pi,\ j=1,2,\ldots,m\}$. We shall
assume that $\xi_1$ possesses finite moments of orders $r+2$, $r\ge 0$. Expanding
$e^{i(u,x)}$ by means of Taylor's formula we obtain

$$J(u)=1+iA_1(u)+i^2A_2(u)+\ldots+i^{r+2}A_{r+2}(u)+o(|u|^{r+2}), \tag{3}$$

where $A_k(u)$ is a homogeneous form in $u$ of order $k$ and

$$A_1(u)=\mathsf{E}(u,\xi_1), \qquad A_2(u)=\tfrac12\mathsf{E}(u,\xi_1)^2.$$

Hence it follows that in a neighbourhood of the point $u=0$ one can
define a single-valued continuous function $\ln J(u)$ which satisfies

$$\ln J(u)=iS_1(u)+i^2S_2(u)+\ldots+i^{r+2}S_{r+2}(u)+o(|u|^{r+2}).$$

Here $S_k(u)$ are also homogeneous forms in $u$ of order $k$,

$$S_1(u)=A_1(u)=\mathsf{E}(u,\xi_1)=(u,a),$$

$$S_2(u)=\tfrac12\mathsf{V}(u,\xi_1)=\tfrac12(Bu,u),$$

where $B$ is the variance-covariance matrix of the vector $\xi_1$. Thus putting
$J_1(u)=e^{-i(u,a)}J(u)$ we obtain

$$J_1^n\!\left(\frac{u}{\sqrt{n}}\right)=e^{-\frac12(Bu,u)+h},$$

where

$$h=\sum_{k=1}^{r}\frac{i^{k+2}}{n^{k/2}}S_{k+2}(u)+o\!\left(\frac{|u|^{r+2}}{n^{r/2}}\right).$$

Since 



Z -ii+o(C^) 

/ln^'"^^n\ 

and =0 — - ■ - I in the region \u\^k \nn, it follows that 
1 / ^ 6?r(“)\ 

...+ Uu)+^^] + 0 



^r + 1 






( 4 ) 



where $Q_r^*(u)$ is a polynomial in $u$ of a fixed degree whose coefficients
remain bounded as $n\to\infty$ and $l_j(u)$ are polynomials in $u$ of degree $3j$.

Denote by $D_j^k$ the operation of the $k$-time partial differentiation with
respect to $u^j$. Note that relation (3) may be differentiated at least $r+2$
times in the sense that

$$D_1^{k_1}\cdots D_m^{k_m}J(u)=D_1^{k_1}\cdots D_m^{k_m}\left[1+iA_1(u)+\ldots+i^{r+2}A_{r+2}(u)\right]+o(|u|^{r+2-p}),$$

where $p=k_1+\ldots+k_m$. From here it follows that a similar assertion holds
for the function $\ln J(u)$ as well. Using this fact it can be shown that relation
(4) can be differentiated term by term with respect to $u$ so that



£)<'’» J"i ( -*^ ) = e ■ ^ 1 + — (u) +.. . 



• • • + — = (m) + — =z I + 0 1 



•Jn' 



+ 1 



/ln^'"^^n\ 



( 5 ) 



where

$$p=k_1+k_2+\ldots+k_m\le r+2.$$

We now proceed to bound the integral

$$I=\int_{\sqrt{n}C}\left|D^{(p)}\left[J_1^n\!\left(\frac{u}{\sqrt{n}}\right)-e^{-\frac12(Bu,u)}\sum_{k=0}^{r}n^{-k/2}l_k(u)\right]\right|du.$$

For this purpose certain inequalities for the function $J(u)$ will be required.
From the facts that the walk is irreducible and that moments of the second
order are finite (cf. Section 6) the existence of a $\delta>0$ follows such that for
$|u|<\delta$

$$|J(u)|<1-\tfrac14(Bu,u),$$

and since the distribution of $\xi_1$ is non-degenerate, $|J(u)|<1-c|u|^2<e^{-c|u|^2}$
for $|u|<\delta$, where $c$ is a constant. On the other hand it follows
from the complete irreducibility of the walk that $|J(u)|<1-\varepsilon$ for $|u|\ge\delta$,
$u\in C$, where $0<\varepsilon<1$. These two bounds for $J(u)$ taken together yield

$$|J(u)|\le\varepsilon e^{-c|u|^2}+1-\varepsilon, \qquad u\in C. \tag{6}$$

The same bound is also applicable to $J_1(u)$.
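We now return to the integral $I$. We have

$$I\le I_1+I_2+I_3,$$

where

$$I_1=\int_{B_n}\left|D^{(p)}\left[J_1^n\!\left(\frac{u}{\sqrt{n}}\right)-e^{-\frac12(Bu,u)}\sum_{k=0}^{r}n^{-k/2}l_k(u)\right]\right|du,$$

$$I_2=\int_{B_n^*}\left|D^{(p)}J_1^n\!\left(\frac{u}{\sqrt{n}}\right)\right|du, \qquad I_3=\int_{B_n^*}\left|D^{(p)}\left[e^{-\frac12(Bu,u)}\sum_{k=0}^{r}n^{-k/2}l_k(u)\right]\right|du$$

and

$$B_n=\{u: |u|\le\sqrt{k\ln n}\}\cap(\sqrt{n}C),$$

$$B_n^*=\{u: |u|>\sqrt{k\ln n}\}\cap(\sqrt{n}C).$$

It follows from (5) that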



MiV of— 

Taking into account the uniform boundedness of the coefficients of the
polynomial $Q_r^*(u)$ regarded as a function of $n$ and the convergence of the
integrals $\int e^{-\frac12(Bu,u)}|P(u)|\,du$ for any polynomial $P(u)$, we obtain







DP 






Bn 



h = 0 + 0 0(^/i^r = o{n-''^) . 

Furthermore, 



/,= 






Bt, 



du = n”'l^ 






|D<^’V"i(u)|du. 
The expression $D^{(p)}J_1^n(u)$ is a polynomial in $J_1(u)$ and its partial
derivatives; moreover $J_1(u)$ enters into this polynomial in the $(n-p)$-th power or
higher, while the partial derivatives of $J_1(u)$ are of order $p\le r+2$ or lower and
enter in powers $p$ or lower, and finally the coefficients of this polynomial
are of order $n^p$. Therefore

$$|D^{(p)}J_1^n(u)|\le An^p|J_1(u)|^{n-p},$$

where $A$ is independent of $n$. Hence it follows from (6) that



__ ill n 

l2^An ^(l+e(e " -1))"“'’ du^ 



P+j cfc Inn / ^ 

^(2n)'"i^An " - 1)=0 



n 



Qck — p — 7 



Therefore a constant k can be chosen independently of n such that 

1 



1 2^0 






It remains to bound the integral We have 






k(Bu,u) 



\Qmp{u)\ du^e 



— ^ck 






-ic|M|2 



\Q„rp(u)\ du, 



Bp, 






where $Q_{nrp}$ is a polynomial whose degree depends only on $r$ and $p$ while
the coefficients depend also on $n$, but remain bounded as $n\to\infty$. Again,
it follows from the convergence of the integral in the r.h.s. of the last
inequality that a $k$ can be chosen such that



/a — o 



W2 



Thus we have proved that for a suitable choice of k 






( 7 ) 



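Consider now the integral

$$L=\int_{\mathscr{R}^m}e^{-i(u,x)}D^{(p)}\left[e^{-\frac12(Bu,u)}\sum_{k=0}^{r}n^{-k/2}l_k(u)\right]du.$$

For $r=0$ and $p=0$ this integral is the Fourier transform of the characteristic
function of an $m$-dimensional normal distribution. Therefore

$$\int_{\mathscr{R}^m}e^{-i(u,x)}e^{-\frac12(Bu,u)}\,du=\frac{(2\pi)^{m/2}}{\sqrt{|B|}}\,e^{-\frac12(B^{-1}x,x)},$$

where $|B|$ is the determinant of the matrix $B$ and $B^{-1}$ is its inverse. In this
formula it is permissible to differentiate with respect to $x$ an arbitrary
number of times and moreover in the l.h.s. of this equality this differentiation
may be carried out under the sign of the integral. Therefore we have
for an arbitrary polynomial $P(u)$

$$\int_{\mathscr{R}^m}e^{-i(u,x)}P(u)\,e^{-\frac12(Bu,u)}\,du=\frac{(2\pi)^{m/2}}{\sqrt{|B|}}\,e^{-\frac12(B^{-1}x,x)}\,Q(x),$$

where $Q(x)$ is a polynomial in $x$ of the same degree as the polynomial $P$. Next,
using integration by parts we obtain

$$\int_{\mathscr{R}^m}e^{-i(u,x)}D^{(p)}\left[e^{-\frac12(Bu,u)}P(u)\right]du=(ix^1)^{k_1}(ix^2)^{k_2}\cdots(ix^m)^{k_m}\int_{\mathscr{R}^m}e^{-i(u,x)}e^{-\frac12(Bu,u)}P(u)\,du,$$

provided $D^{(p)}=\dfrac{\partial^p}{(\partial u^1)^{k_1}(\partial u^2)^{k_2}\cdots(\partial u^m)^{k_m}}$.

Thus

$$L=(ix^1)^{k_1}(ix^2)^{k_2}\cdots(ix^m)^{k_m}\,\frac{(2\pi)^{m/2}}{\sqrt{|B|}}\,e^{-\frac12(B^{-1}x,x)}\sum_{k=0}^{r}\frac{Q_k(x)}{n^{k/2}},$$

where the polynomials $Q_k(x)$ are of the same degree as the polynomials
$l_k(u)$, i.e. of degree $3k$, and $Q_0(x)=1$.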

The proof of the following theorem is now almost completed: 

Theorem 2. Let $\zeta(n)$ be a completely irreducible walk, $\zeta(n)=\xi_1+\xi_2+\ldots+\xi_n$,
where $\xi_k$ are mutually independent and identically distributed vectors
with values in $Z^m$ with finite moments of orders $r+2$, $r\ge 0$.
Then

$$n^{m/2}p^{(n)}(x)=\frac{1}{(2\pi)^{m/2}\sqrt{|B|}}\,e^{-\frac12(B^{-1}x_n,x_n)}\sum_{k=0}^{r}\frac{Q_k(x_n)}{n^{k/2}}+\varepsilon_n,$$

where $x_n=\dfrac{x-na}{\sqrt{n}}$ and

$$(1+|x_n|^{r+2})\,\varepsilon_n=o(n^{-r/2})$$

uniformly in $x$, and $a$ is the vector of mean values of the step of the random
walk, and $B$ is the correlation matrix.
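
In the simplest case $r=0$, $m=1$ the leading term of a local limit theorem is the normal density, and this can be checked on an explicit example. The sketch below is illustrative and rests on assumptions not found in the text: it uses a "lazy" walk with steps $-1,0,+1$ taken with probabilities $1/4,1/2,1/4$ (chosen because $|J(u)|=\cos^2(u/2)$ equals 1 only at multiples of $2\pi$, so the walk is completely irreducible), for which $a=0$, the variance is $1/2$, and $p^{(n)}(x)=\binom{2n}{n+x}4^{-n}$. The function names are hypothetical.

```python
from math import comb, exp, pi, sqrt


def lazy_walk_prob(n, x):
    """Exact p^(n)(x) for steps -1, 0, +1 taken with probabilities 1/4, 1/2, 1/4."""
    if abs(x) > n:
        return 0.0
    return comb(2 * n, n + x) / 4 ** n


def gaussian_term(n, x, mean=0.0, var=0.5):
    """Leading term (2 pi n var)^(-1/2) exp(-(x - n mean)^2 / (2 n var))."""
    return exp(-(x - n * mean) ** 2 / (2 * n * var)) / sqrt(2 * pi * n * var)


for n in (10, 100, 1000):
    worst = max(abs(lazy_walk_prob(n, x) - gaussian_term(n, x))
                for x in range(-n, n + 1))
    print(n, sqrt(n) * worst)  # sqrt(n) times the maximal error shrinks as n grows
```

The simple walk with steps $\pm1$ would not serve here: it is irreducible but not completely irreducible, since $|J(\pi)|=1$, and its $n$-step probabilities vanish on every other lattice point.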

Indeed, we have 

{2n (x) = 

fn' 



Vile 



where 



— i{u, Xn) 



Dip) g-HBu,u) ^ 

L k=o n'‘^^ _ 



du + 74 + /5 , 



1 jyip) 



VUc 



h=- 



y *=o " - 



du. 



^-i{u,Xn) Qip) 



-UBu,u) y 



du. 



^WnC 



However it follows from inequality (7) that 



1 






Moreover, 




^"*\V n C 




in 



du, 






where R{u) is a polynomial; therefore \l 5 \ = 0{g'') = o(n ^){g = e for 
any k. Thus we have for an arbitrary polynomial of degree at most r + 2: 

{2nypif'P(x„) p<"'(x„) = 

It is clear that the second summand in the r.h.s. of the last formula 

m 

depends on the choice of P(x). Set ^ Taking 

j=i 
into account that P(x) may be replaced by a function which takes on 

m 

smaller values and that we obtain the required 

j=i 



result. □ 



§ 8. Ergodic Theorems 

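Measure-preserving transformations. Definition 1. A random process $\{\xi(t), t \in T\}$ with values in a measurable space $\{\mathcal{X}, \mathfrak{B}\}$ is called stationary if for any $n$, $t_1, t_2, \dots, t_n$ and $t$ such that $t_k + t \in T$ $(k = 1, \dots, n)$, the joint distribution in $\{\mathcal{X}^n, \mathfrak{B}^n\}$ of the sequence

$$\xi(t_1 + t),\ \xi(t_2 + t),\ \dots,\ \xi(t_n + t)$$

does not depend on $t$.

This definition of a stationary process is equivalent to the following: for an arbitrary bounded $\mathfrak{B}^n$-measurable function $f(x_1, \dots, x_n)$ the mathematical expectation

$$\mathsf{E} f(\xi(t_1 + t), \xi(t_2 + t), \dots, \xi(t_n + t))$$

is independent of $t$ for any choice of $n$, $t_1, \dots, t_n$ $(t_k + t \in T)$. It follows from here that if $h(x_1, \dots, x_n)$ is a measurable mapping of $\{\mathcal{X}^n, \mathfrak{B}^n\}$ into a measurable space $\{\mathcal{Y}, \mathfrak{A}\}$, then $\eta(t) = h(\xi(t_1 + t), \dots, \xi(t_n + t))$ is a stationary process on the set of values $t$ for which $\eta(t)$ is defined.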

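In the present section we shall consider stationary sequences, i.e. stationary processes defined on the set $T = \{t : t = 0, \pm 1, \pm 2, \dots\}$ with values in $\{\mathcal{X}, \mathfrak{B}\}$.

Let $\mathcal{X}^T$ be the space of all sequences $u = \{\dots, x_{-n}, x_{-n+1}, \dots, x_0, x_1, \dots, x_n, \dots\}$, $\mathfrak{C}$ be the minimal $\sigma$-algebra containing all the cylinders in $\mathcal{X}^T$, and $\mathsf{P}_\xi$ be the measure induced on $\mathfrak{C}$ by the sequence $\{\xi(t), t \in T\}$. Thus the probability space $\{\mathcal{X}^T, \mathfrak{C}, \mathsf{P}_\xi\}$ is a natural representation of the process $\{\xi(t), t \in T\}$. Denote by $\{\mathcal{X}^T, \bar{\mathfrak{C}}, \mathsf{P}_\xi\}$ the space with the completed measure. Introduce in $\mathcal{X}^T$ the shift operation $S$: $u' = Su$ if $x'_n = x_{n+1}$, $n \in T$, where $u = \{x_n, n \in T\}$, $u' = \{x'_n, n \in T\}$. The operation $S$ possesses an inverse $S^{-1}$ and moreover if $u'' = S^{-1}u$, then $u'' = \{x''_n, n \in T\}$, $x''_n = x_{n-1}$. The condition of stationarity of the sequence $\xi(t)$ means that for an arbitrary cylinder $C$

$$\mathsf{P}_\xi(C) = \mathsf{P}_\xi(SC). \tag{1}$$

Since a measure on cylinders uniquely determines a measure on $\mathfrak{C}$ and on its completion $\bar{\mathfrak{C}}$, the equality remains valid for an arbitrary $A \in \bar{\mathfrak{C}}$:

$$\mathsf{P}_\xi(A) = \mathsf{P}_\xi(SA), \qquad A \in \bar{\mathfrak{C}}. \tag{2}$$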
Definition 2. Let $\{\mathcal{U}, \mathfrak{F}, \mu\}$ be a measure space, $S$ be a measurable transformation of $\{\mathcal{U}, \mathfrak{F}\}$ into $\{\mathcal{U}, \mathfrak{F}\}$. The transformation $S$ is called measure-preserving if for any $A \in \mathfrak{F}$

$$\mu(S^{-1}A) = \mu(A),$$

where $S^{-1}A$ is the complete pre-image of the set $A$.

A transformation $S$ is called invertible if there exists a measurable transformation $S^{-1}$ such that $SS^{-1} = S^{-1}S = I$, where $I$ is the identity transformation. In this case the transformation $S^{-1}$ is called the inverse transformation of $S$. The definition of a stationary sequence is equivalent to the following: a sequence $\{\xi(t), t \in T\}$ is stationary if the shift operator $S$ in $\mathcal{X}^T$ preserves the measure $\mathsf{P}_\xi$.

Therefore the problem of studying stationary sequences is a particular case of the problem of studying measure-preserving invertible transformations (automorphisms) of a certain measure space. Consider the problem of the asymptotic behavior of the mean

$$\frac{1}{n} \sum_{k=0}^{n-1} f(S^k u), \qquad n \to \infty, \tag{3}$$

where $S^k$ is the $k$-th power of the transformation $S$, $f(u)$ is an arbitrary $\mathfrak{F}$-measurable function, $\{\mathcal{U}, \mathfrak{F}, \mu\}$ is a space with measure $\mu$ and $\mu(\mathcal{U}) \le \infty$. To clarify the meaning of the problem consider the case when $\{\mathcal{U}, \mathfrak{F}, \mu\}$ coincides with $\{\mathcal{X}^T, \bar{\mathfrak{C}}, \mathsf{P}_\xi\}$ and $S$ is the shift operator. Let $\xi(k) = \xi(k, u) = x_k$, $f(u) = \chi_B(x_0)$, where $\chi_B(x)$ is the indicator of the set $B \in \mathfrak{B}$. Then

$$f(S^k u) = \chi_B(x_k) = \chi_B(\xi(k))$$

and

$$\frac{1}{n} \sum_{k=0}^{n-1} f(S^k u) = \frac{\nu_n(B, u)}{n}, \tag{4}$$

where $\nu_n(B, u)$ is the number of terms in the sequence $\xi(0), \xi(1), \dots, \xi(n-1)$ whose values fall into the set $B$, i.e. $\nu_n(B, u)/n$ is the frequency of hitting the set $B$ by the first $n$ terms of the sequence $\xi(t)$ $(t = 0, 1, \dots, n-1)$. Therefore the problem under consideration is a particular case of a problem concerning the behavior of the frequency with which values of a random variable $\xi(t)$ fall into an arbitrary set $B$. Firstly, we shall prove that there exists with probability 1 the limit as $n \to \infty$ of the mean value given in (3). This proposition is the well-known Birkhoff-Khinchin theorem.
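The mean (3) can be explored numerically. The following minimal sketch is not taken from the text: it assumes, purely for illustration, the rotation $S(u) = u + \alpha \pmod 1$ of the interval $[0, 1)$ with an irrational $\alpha$ (which preserves Lebesgue measure) and the indicator of the hypothetical set $B = [0, 0.3)$. The names `ergodic_average`, `rotation`, `indicator_B` and the constant `ALPHA` are choices made for this example only.

```python
import math


def ergodic_average(f, S, u, n):
    """Return (1/n) * sum_{k=0}^{n-1} f(S^k u), i.e. the mean (3)."""
    total = 0.0
    for _ in range(n):
        total += f(u)
        u = S(u)
    return total / n


# Irrational rotation angle (an assumption of this sketch).
ALPHA = (math.sqrt(5.0) - 1.0) / 2.0


def rotation(u):
    """Rotation of [0, 1) by ALPHA; it preserves Lebesgue measure."""
    return (u + ALPHA) % 1.0


def indicator_B(u):
    """Indicator of the hypothetical set B = [0, 0.3)."""
    return 1.0 if 0.0 <= u < 0.3 else 0.0


# Frequency nu_n(B, u) / n of visits to B along the orbit of u = 0.1.
freq = ergodic_average(indicator_B, rotation, 0.1, 100_000)
print(freq)
```

For this particular transformation the printed frequency should be close to the Lebesgue measure 0.3 of $B$, in agreement with formula (4).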

Lemma 1. If $S$ preserves measure $\mu$, $D \in \mathfrak{F}$ and $f(u)$ is an $\mathfrak{F}$-measurable non-negative (or $\mu$-integrable) function, then

$$\int_{S^{-1}D} f(Su)\, \mu(du) = \int_{D} f(u)\, \mu(du). \tag{5}$$

Proof. If we put $f(u) = \chi_A(u)$, formula (5) becomes the equality $\mu(S^{-1}(A \cap D)) = \mu(A \cap D)$, which is valid for any $A$ and $D \in \mathfrak{F}$. It follows from here that formula (5) is valid for arbitrary $\mathfrak{F}$-measurable non-negative and $\mu$-integrable functions. □

The following lemma is of an elementary arithmetic nature. Let $a_1, a_2, \dots, a_n$ be a sequence of real numbers, and $p$ be an integer. We refer to the term $a_k$ of the sequence as $p$-marked * if in the sequence of sums

$$a_k,\ a_k + a_{k+1},\ \dots,\ a_k + a_{k+1} + \dots + a_{k+p-1}$$

at least one sum is non-negative ($a_k$ is 1-marked iff it is non-negative).

Lemma 2. The sum of all p-marked elements is non-negative. 

Proof. Let $a_{k_1}$ be a $p$-marked element of the sequence with the lowest index and let $a_{k_1} + a_{k_1+1} + \dots + a_{k_1+r}$ be the non-negative sum with the smallest number of summands. For $h < r$, $a_{k_1} + a_{k_1+1} + \dots + a_{k_1+h} < 0$, hence $a_{k_1+h+1} + \dots + a_{k_1+r} \ge 0$, i.e. all the terms of the sequence $a_{k_1}, a_{k_1+1}, \dots, a_{k_1+r}$ are $p$-marked and their sum is non-negative.

These considerations may be applied to the sequence starting with its $(k_1 + r + 1)$-th term. Thus the whole sequence is subdivided into parts where each part ends with a group of $p$-marked terms and the sum of the $p$-marked elements in each part is non-negative. The set of $p$-marked elements in the whole sequence coincides with the union of sets of the $p$-marked elements contained in its parts, which completes the proof of the lemma. □
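The statement of Lemma 2 is easy to check by machine. The sketch below is illustrative only: the function names are hypothetical, and it assumes that sums which would run past the end of a finite list are simply truncated there (the argument of the proof still applies under this convention). Integer sequences are used so that all comparisons are exact.

```python
import random


def p_marked_indices(a, p):
    """Indices k such that some sum a[k] + ... + a[k+j-1], 1 <= j <= p, is >= 0.

    Assumption of this sketch: sums are truncated at the end of the list.
    """
    marked = []
    n = len(a)
    for k in range(n):
        s = 0
        for j in range(k, min(k + p, n)):
            s += a[j]
            if s >= 0:
                marked.append(k)
                break
    return marked


def sum_of_p_marked(a, p):
    """Sum of all p-marked elements of the list a."""
    return sum(a[k] for k in p_marked_indices(a, p))


# Randomized check of Lemma 2 on integer sequences.
random.seed(0)
for _ in range(1000):
    seq = [random.randint(-5, 5) for _ in range(30)]
    p = random.randint(1, 6)
    assert sum_of_p_marked(seq, p) >= 0
```

For instance, in the sequence $-1, 2, -3$ with $p = 2$ the first two terms are 2-marked and the last one is not.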
The following lemma is the basic step in the proof of Birkhoff- 
Khinchin’s theorem. 

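Lemma 3. Let $f(u)$ be a $\mu$-integrable function, $S$ be a measurable transformation of $\{\mathcal{U}, \mathfrak{F}\}$ into $\{\mathcal{U}, \mathfrak{F}\}$ preserving the measure $\mu$ and let

$$E = \bigcup_{n=1}^{\infty} \Big\{u : \sum_{k=1}^{n} f(S^{k-1}u) \ge 0\Big\}.$$

Then

$$\int_E f(u)\, \mu(du) \ge 0. \tag{6}$$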

Proof. Consider the sequence $f(u), f(Su), \dots, f(S^{N+p-1}u)$ and denote by $s(u)$ the sum of all $p$-marked elements of this sequence. In view of Lemma 2, $s(u) \ge 0$. Let $D_k = \{u : f(S^k u) \text{ is a } p\text{-marked element}\}$, and let $\chi_k(u)$ be the indicator of the set $D_k$. Note that

$$D_0 = \Big\{u : \sup_{n \le p} \sum_{k=1}^{n} f(S^{k-1}u) \ge 0\Big\} \quad \text{and} \quad D_k = S^{-1}D_{k-1} \ \text{for } k \le N.$$

Hence $D_k = S^{-k}D_0$ $(k \le N)$. We thus have:

$$0 \le \int s(u)\, \mu(du) = \int \sum_{k=0}^{N+p-1} f(S^k u)\, \chi_k(u)\, \mu(du) = \sum_{k=0}^{N+p-1} \int_{D_k} f(S^k u)\, \mu(du).$$

In view of Lemma 1

$$\int_{D_k} f(S^k u)\, \mu(du) = \int_{S^{-1}D_{k-1}} f(S^k u)\, \mu(du) = \int_{D_{k-1}} f(S^{k-1}u)\, \mu(du) = \dots = \int_{D_0} f(u)\, \mu(du), \qquad k \le N.$$

Consequently,

$$(N+1) \int_{D_0} f(u)\, \mu(du) + \sum_{k=N+1}^{N+p-1} \int_{D_k} f(S^k u)\, \mu(du) \ge 0. \tag{7}$$

Since

$$\Big| \int_{D_k} f(S^k u)\, \mu(du) \Big| \le \int_{D_k} |f(S^k u)|\, \mu(du) \le \int_{\mathcal{U}} |f(S^k u)|\, \mu(du) = \int_{\mathcal{U}} |f(u)|\, \mu(du) < \infty,$$

dividing inequality (7) by $N$ and letting $N \to \infty$, we obtain

$$\int_{D_0} f(u)\, \mu(du) \ge 0. \tag{8}$$

The sets $D_0 = D_0(p)$ $(p = 1, 2, \dots)$ form a monotonically increasing sequence and

$$\lim_{p \to \infty} D_0(p) = \bigcup_{p=1}^{\infty} D_0(p) = E.$$

Passing to the limit in (8) as $p \to \infty$, we obtain (6). □

* Or $p$-non-negative, using Loève's terminology. Translator's Remark.

Lemma 4. (Maximal ergodic theorem). If $f(u)$ is $\mu$-integrable, $\lambda$ is a real number and

$$E_\lambda = \bigcup_{n=1}^{\infty} \Big\{u : \frac{1}{n} \sum_{k=1}^{n} f(S^{k-1}u) \ge \lambda\Big\},$$

then

$$\int_{E_\lambda} f(u)\, \mu(du) \ge \lambda\, \mu(E_\lambda). \tag{9}$$

Applying Lemma 3 to the function $f(u) - \lambda$ we obtain the proof of the theorem.
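Inequality (9) can be verified exactly on a small example. The sketch below is not from the text: it assumes the cyclic shift $S(u) = u + 1 \pmod m$ on $\{0, \dots, m-1\}$ with the counting measure, and an arbitrary integer-valued $f$ chosen for illustration. Because the orbit is periodic, every average over $n > m$ terms is a weighted mean of the full-period average and a shorter average, so it suffices to examine $n \le m$.

```python
from fractions import Fraction


def maximal_set(f, lam):
    """E_lam = {u : (1/n) * sum_{k=1}^{n} f(S^{k-1} u) >= lam for some n >= 1}
    for the cyclic shift S(u) = (u + 1) % m; only n <= m needs to be checked."""
    m = len(f)
    members = []
    for u in range(m):
        s = 0
        for n in range(1, m + 1):
            s += f[(u + n - 1) % m]
            if Fraction(s, n) >= lam:
                members.append(u)
                break
    return members


f = [3, -4, 1, -2, 5, -6, 0, 2]     # hypothetical function on Z_8
lam = Fraction(1, 2)
E = maximal_set(f, lam)
lhs = sum(f[u] for u in E)          # integral of f over E_lam (counting measure)
rhs = lam * len(E)                  # lam * mu(E_lam)
print(E, lhs, rhs)
```

Exact rational arithmetic is used so that the comparison `lhs >= rhs` is not affected by rounding.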

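Theorem 1. (The Birkhoff-Khinchin theorem). Let $\{\mathcal{U}, \mathfrak{F}, \mu\}$ be a measure space, $S$ be a measurable mapping of $\{\mathcal{U}, \mathfrak{F}\}$ into $\{\mathcal{U}, \mathfrak{F}\}$ preserving measure $\mu$ and $f(u)$ be an arbitrary $\mu$-integrable function. Then there exists the limit

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} f(S^k u) = f^*(u) \tag{10}$$

$\mu$-almost everywhere in $\mathcal{U}$; the function $f^*(u)$ is $S$-invariant, i.e.

$$f^*(Su) = f^*(u) \pmod{\mu}, \tag{11}$$

the function $f^*(u)$ is integrable and, if $\mu(\mathcal{U}) < \infty$, then

$$\int_{\mathcal{U}} f^*(u)\, \mu(du) = \int_{\mathcal{U}} f(u)\, \mu(du). \tag{12}$$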

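Proof. We may assume without loss of generality that $f(u)$ is finite and non-negative.

Set

$$g^*(u) = \limsup_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} f(S^k u), \qquad g_*(u) = \liminf_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} f(S^k u).$$

It is required to show that $g^*(u) = g_*(u) \pmod{\mu}$. Let

$$K_{\alpha\beta} = \{u : g^*(u) > \beta,\ g_*(u) < \alpha\}, \qquad 0 \le \alpha < \beta.$$

It is sufficient to show that $\mu(K_{\alpha\beta}) = 0$ for all $\alpha, \beta$ $(\alpha, \beta \in R)$. (Indeed $\{u : g^*(u) > g_*(u)\} = \bigcup_{\alpha < \beta;\ \alpha, \beta \in R} K_{\alpha\beta}$, where $R$ is the set of non-negative rational numbers.) Note that

$$g^*(Su) = \limsup_{n \to \infty} \Big\{ \frac{n+1}{n} \cdot \frac{1}{n+1} \sum_{k=0}^{n} f(S^k u) - \frac{f(u)}{n} \Big\} = g^*(u)$$

and analogously $g_*(Su) = g_*(u)$. This means in particular that $S^{-1}K_{\alpha\beta} = K_{\alpha\beta}$. Therefore Lemma 4 is applicable to the space with measure $\{K_{\alpha\beta}, \mathfrak{F}, \mu\}$. It thus follows that

$$\int_{K_{\alpha\beta}} f(u)\, \mu(du) \ge \beta\, \mu(K_{\alpha\beta}). \tag{13}$$

Applying Lemma 4 to the function $-f(u)$ we obtain

$$\int_{K_{\alpha\beta}} f(u)\, \mu(du) \le \alpha\, \mu(K_{\alpha\beta}). \tag{14}$$

Since $\beta > 0$, it follows from (13) that $\mu(K_{\alpha\beta}) < \infty$, but then inequality (14) may hold only if $\mu(K_{\alpha\beta}) = 0$. Thus the existence $(\mathrm{mod}\ \mu)$ of limit (10) is verified. Set $f^*(u) = g^*(u)$. Then equation (10) is satisfied, and the function $f^*(u)$ is $S$-invariant everywhere on $\mathcal{U}$.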

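In order to prove formula (12) we set

$$A_{kn} = \Big\{u : \frac{k}{2^n} \le f^*(u) < \frac{k+1}{2^n}\Big\}.$$

We have $\mathcal{U} = \bigcup_{k=-\infty}^{\infty} A_{kn}$ and

$$S^{-1}A_{kn} = \Big\{u : \frac{k}{2^n} \le f^*(Su) < \frac{k+1}{2^n}\Big\} = A_{kn}.$$

We now apply Lemma 4 to the set $A_{kn}$. For any $\varepsilon > 0$ we obtain

$$\int_{A_{kn}} f(u)\, \mu(du) \ge \Big(\frac{k}{2^n} - \varepsilon\Big) \mu(A_{kn}).$$

Now as $\varepsilon \to 0$ we have the inequality $\int_{A_{kn}} f(u)\, \mu(du) \ge \frac{k}{2^n}\, \mu(A_{kn})$. Analogously, $\int_{A_{kn}} f(u)\, \mu(du) \le \frac{k+1}{2^n}\, \mu(A_{kn})$, and thus

$$\Big| \int_{A_{kn}} f(u)\, \mu(du) - \int_{A_{kn}} f^*(u)\, \mu(du) \Big| \le \frac{1}{2^n}\, \mu(A_{kn}).$$

Summing up these inequalities over all $k$, we have

$$\Big| \int_{\mathcal{U}} f(u)\, \mu(du) - \int_{\mathcal{U}} f^*(u)\, \mu(du) \Big| \le \frac{\mu(\mathcal{U})}{2^n}.$$

Taking into account that $n$ can be chosen arbitrarily in the case when $\mu(\mathcal{U}) < \infty$, we obtain formula (12). The theorem is thus proved. □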

Some corollaries of Birkhoff-Khinchin’s theorem. 

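Corollary 1. Let $\mu(\mathcal{U}) < \infty$, $f(u) \in \mathcal{L}_p\{\mathcal{U}, \mathfrak{F}, \mu\}$. Then

$$\int_{\mathcal{U}} \Big| \frac{1}{n} \sum_{k=0}^{n-1} f(S^k u) - f^*(u) \Big|^p \mu(du) \to 0 \quad \text{as } n \to \infty. \tag{15}$$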

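To prove this statement we consider an arbitrary bounded function $f_0(u)$ and let $\|f(u) - f_0(u)\|_p = \delta$, where $\|f\|_p$ is the norm of the element $f$ in $\mathcal{L}_p\{\mathcal{U}, \mathfrak{F}, \mu\}$. Then

$$\Big\| \frac{1}{n} \sum_{k=0}^{n-1} f(S^k u) - f^*(u) \Big\|_p \le \Big\| \frac{1}{n} \sum_{k=0}^{n-1} [f(S^k u) - f_0(S^k u)] \Big\|_p + \Big\| \frac{1}{n} \sum_{k=0}^{n-1} f_0(S^k u) - f_0^*(u) \Big\|_p + \|f_0^*(u) - f^*(u)\|_p.$$

In view of Jensen's inequality and Lemma 1

$$\Big\| \frac{1}{n} \sum_{k=0}^{n-1} [f(S^k u) - f_0(S^k u)] \Big\|_p = \Big\{ \int_{\mathcal{U}} \Big| \frac{1}{n} \sum_{k=0}^{n-1} (f(S^k u) - f_0(S^k u)) \Big|^p \mu(du) \Big\}^{1/p} \le \Big\{ \frac{1}{n} \sum_{k=0}^{n-1} \int_{\mathcal{U}} |f(S^k u) - f_0(S^k u)|^p \mu(du) \Big\}^{1/p} = \delta.$$

Utilizing Fatou's lemma we obtain

$$\|f_0^*(u) - f^*(u)\|_p = \Big\{ \int_{\mathcal{U}} \lim_{n \to \infty} \Big| \frac{1}{n} \sum_{k=0}^{n-1} [f(S^k u) - f_0(S^k u)] \Big|^p \mu(du) \Big\}^{1/p} \le \liminf_{n \to \infty} \Big\| \frac{1}{n} \sum_{k=0}^{n-1} [f(S^k u) - f_0(S^k u)] \Big\|_p \le \delta.$$

Next, since $f_0(u)$ is bounded all its means are bounded by the same constant. Therefore in the expression

$$\Big\| \frac{1}{n} \sum_{k=0}^{n-1} f_0(S^k u) - f_0^*(u) \Big\|_p = \Big\{ \int_{\mathcal{U}} \Big| \frac{1}{n} \sum_{k=0}^{n-1} f_0(S^k u) - f_0^*(u) \Big|^p \mu(du) \Big\}^{1/p},$$

in view of Lebesgue's theorem, it is permissible to pass to the limit as $n \to \infty$ under the sign of the integral. Hence this expression tends to zero, and for $n$ sufficiently large it becomes less than $\delta$. Therefore,

$$\Big\| \frac{1}{n} \sum_{k=0}^{n-1} f(S^k u) - f^*(u) \Big\|_p < 3\delta, \qquad n \ge n_0 = n_0(\delta),$$

where the number $\delta$ can be chosen arbitrarily small $(\delta > 0)$. The proof of (15) is thus completed. □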



Definition 3. The set $A \in \mathfrak{F}$ is called $S$-invariant if $\mu((S^{-1}A)\, \Delta\, A) = 0$, where $\Delta$ denotes the symmetric difference of sets.

It is easy to verify that the class of all $S$-invariant sets forms a $\sigma$-algebra of $\mathfrak{F}$-measurable sets. Next, if $g(u)$ is an $S$-invariant function, then the sets $\{u : g(u) \le c\}$, $\{u : g(u) = c\}$ are $S$-invariant. On the other hand if $A$ is $S$-invariant, then $\chi_A(u)$ is an $S$-invariant function. Denote the $\sigma$-algebra of $S$-invariant sets by $I$. Let $\mu(\mathcal{U}) = 1$. We consider $\{\mathcal{U}, \mathfrak{F}, \mu\}$ to be a probability space and let the symbol $\mathsf{E}$ denote the integration with respect to measure $\mu$.

Corollary 2. $f^*(u) = \mathsf{E}\{f(u) \mid I\} \pmod{\mu}$.

Clearly $\mathsf{E}\{f(u) \mid I\}$ is an $S$-invariant function. Therefore to prove Corollary 2 it is sufficient to verify that for an arbitrary bounded $S$-invariant function $g(u)$

$$\mathsf{E}\, g(u) \big(f^*(u) - \mathsf{E}\{f(u) \mid I\}\big) = 0$$

or that $\mathsf{E}(g(u) f^*(u) - g(u) f(u)) = 0$. The latter however follows from (12) since

$$(g(u) f(u))^* = \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} g(S^k u) f(S^k u) = g(u) f^*(u) \pmod{\mu}. \ \ □$$
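A finite example makes Corollary 2 concrete. The sketch below is an illustration under stated assumptions, not part of the text: $\mathcal{U} = \{0, \dots, 4\}$ carries the uniform measure, $S$ is a hypothetical permutation with two cycles, and the invariant sets are unions of cycles. In this setting the limit $f^*$ is the mean of $f$ over the cycle containing $u$, which is exactly the conditional expectation given the invariant $\sigma$-algebra.

```python
from fractions import Fraction


def birkhoff_limit(f, S):
    """Exact limit f*(u) of the ergodic averages for a permutation S of
    {0, ..., m-1} with the uniform measure: the mean of f over the cycle of u."""
    m = len(f)
    f_star = [None] * m
    for u in range(m):
        if f_star[u] is not None:
            continue
        cycle = [u]
        v = S[u]
        while v != u:
            cycle.append(v)
            v = S[v]
        mean = Fraction(sum(f[w] for w in cycle), len(cycle))
        for w in cycle:
            f_star[w] = mean
    return f_star


S = [1, 2, 0, 4, 3]        # hypothetical permutation with cycles (0 1 2) and (3 4)
f = [6, 0, 3, 1, 7]        # hypothetical function on the five points
f_star = birkhoff_limit(f, S)
print(f_star)
```

Here $f^*$ is not constant because $S$ is not metrically transitive, yet the sums of $f$ and $f^*$ over the whole space coincide, as formula (12) requires.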



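Ergodic stationary sequences. We now return to stationary sequences.

Let $\{\xi(t), t \in T\}$ be a stationary sequence and $\{\mathcal{X}^T, \mathfrak{C}, \mathsf{P}\}$ be its natural representation.

Corollary 3. If $f$ is a $\mathfrak{C}$-measurable function and $\mathsf{E}|f(\xi(0), \xi(1), \dots)| < \infty$, then with probability 1 as $n \to \infty$,

$$\frac{1}{n} \sum_{k=0}^{n-1} f(\xi(k), \xi(k+1), \dots) \to \mathsf{E}\{f(\xi(0), \xi(1), \dots) \mid I\},$$

where $I$ is the $\sigma$-algebra of events in $\mathfrak{C}$ invariant under shift transformations.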

Consider an arbitrary event $A \in \mathfrak{C}$ and a sequence of events obtained from $A$ by means of the "shifts": $A, S^{-1}A, S^{-2}A, \dots$. If $\chi_k$ is the indicator of the event $S^{-k}A$, then $\chi_k$ $(k = 0, \pm 1, \dots)$ forms a stationary sequence of random variables and $\frac{1}{n} \sum_{k=0}^{n-1} \chi_k$ is the frequency of occurrences of event $A$ evaluated from a single realization of the sequence $\{\xi(t), t = 0, 1, 2, \dots\}$ under $n - 1$ consecutive shifts from the "origin",

$$\frac{1}{n} \sum_{k=0}^{n-1} \chi_k = \frac{\nu_n(A)}{n}.$$

In view of the Birkhoff-Khinchin theorem there exists with probability 1 the limit

$$\lim_{n \to \infty} \frac{\nu_n(A)}{n} = \pi(A) \quad \text{and} \quad \mathsf{E}\pi(A) = \mathsf{P}(A).$$

The quantity $\pi(A)$ may be called the empirical probability of event $A$. This quantity is a random variable and is determined from a single realization of the infinite sequence $\{\xi(t), t = 0, 1, 2, \dots\}$. The question arises: under what circumstances is the empirical probability $\pi(A)$ independent of chance and equal to the probability $\mathsf{P}(A)$?

Stationary sequences which possess this property are called ergodic. The following definition is more general:

Definition 4. Let $\{\mathcal{U}, \mathfrak{F}, \mu\}$ be a probability space, let $S$ be a measure-preserving transformation of $\mathcal{U}$ into itself, and let $\nu_n(A) = \nu_n(A, u)$ be the number of terms of the sequence $\{u, Su, \dots, S^{n-1}u\}$ falling into the set $A$. The transformation $S$ is called ergodic if for any $A \in \mathfrak{F}$

$$\lim_{n \to \infty} \frac{\nu_n(A)}{n} = \mu(A) \pmod{\mu}.$$

The transformation $S$ is called metrically transitive if any $S$-invariant set has measure 0 or 1.

Theorem 2. In order that a transformation $S$ in the probability space $\{\mathcal{U}, \mathfrak{F}, \mu\}$ be ergodic it is necessary and sufficient that one of the following two conditions be satisfied:

a) $S$ is metrically transitive;

b) For any $\mathfrak{F}$-measurable $\mu$-integrable function $f(u)$, the function

$$f^*(u) = \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} f(S^k u)$$

is constant with probability 1.

Proof. Let $A$ be an $S$-invariant set, $0 < \mu(A) < 1$. The sets $A, S^{-1}A, S^{-2}A, \dots$ differ on a set of measure 0 and $\chi_A(S^k u) = \chi_A(u) \pmod{\mu}$. Hence $\lim_{n \to \infty} \nu_n(A)/n = \chi_A(u)$ cannot be a constant $(\mathrm{mod}\ \mu)$. Therefore ergodicity yields metrical transitivity. Now let $S$ be metrically transitive. Since the function $f^*(u)$ is $S$-invariant the symmetric difference of the sets

$$S^{-1}\{u : f^*(u) < x\} = \{u : f^*(Su) < x\} \quad \text{and} \quad \{u : f^*(u) < x\}$$

is of $\mu$-measure 0. From here it follows that $\mu\{u : f^*(u) < x\} = 0$ or 1 for any real $x$, i.e. $f^*(u) = \mathrm{const} \pmod{\mu}$. Therefore a) implies b). Finally, the condition of ergodicity is a particular case of condition b), namely the case when $f(u)$ is the indicator of a certain event. □

We now present a few corollaries of ergodicity. 

Let $\{\mathcal{X}^T, \mathfrak{C}, \mathsf{P}\}$ be a natural representation of a stationary sequence $\xi(n)$, $S$ be the shift transformation in $\mathcal{X}^T$, and $\mathcal{L}_2 = \mathcal{L}_2\{\mathcal{X}^T, \mathfrak{C}, \mathsf{P}\}$.

It follows from Corollary 1 of Theorem 1 that for arbitrary functions $f(u)$ and $g(u)$ in $\mathcal{L}_2$

$$\lim_{n \to \infty} \int \frac{1}{n} \sum_{k=0}^{n-1} f(S^k u)\, g(u)\, \mathsf{P}(du) = \int f^*(u)\, g(u)\, \mathsf{P}(du). \tag{16}$$

We say that the sequence $\{\xi(n), n = 0, \pm 1, \dots\}$ is ergodic if the transformation $S$ is ergodic. Set $g(u) = \eta$, $f(S^k u) = \zeta_k$, and assume that the initial stationary sequence $\{\xi(n), n = 0, \pm 1, \dots\}$ is ergodic. Relation (16) now becomes

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \mathsf{E}\, \zeta_k \eta = \mathsf{E}\zeta_0\, \mathsf{E}\eta. \tag{17}$$

Let

$$g(u) = \chi_B(u), \qquad f(u) = \chi_A(u), \qquad A \text{ and } B \in \mathfrak{C}.$$

It follows from (17) that

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \mathsf{P}(S^{-k}A \cap B) = \mathsf{P}(A)\, \mathsf{P}(B) \tag{18}$$

or (if $\mathsf{P}(B) \ne 0$)

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \mathsf{P}(S^{-k}A \mid B) = \mathsf{P}(A), \tag{19}$$

where $\mathsf{P}(S^{-k}A \mid B)$ is the conditional probability of the event $S^{-k}A$ given $B$.

Lemma 5. The validity of equality (18) (or (19)) for any sets $A, B \in \mathfrak{C}$ is equivalent to ergodicity.

It is sufficient to show that (18) implies ergodicity. Let $C$ be an arbitrary $S$-invariant event. Set $A = B = C$. This equality then becomes $\mathsf{P}(C) = \mathsf{P}^2(C)$, i.e. $\mathsf{P}(C) = 0$ or 1 and the lemma follows from Theorem 2. □

Equation (19) has the following probabilistic meaning. Let $A$ and $B$ be two events in $\mathfrak{C}$. If event $A$ is shifted indefinitely in time, then on the average, events $S^{-n}A$ and $B$ become independent for any event $B$.
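Relation (18) can be checked exactly on a toy model. The sketch below is not from the text: it assumes the cyclic shift $S(u) = u + 1 \pmod 6$ on six equally likely points, with hypothetical sets `A` and `B`. This transformation is ergodic, so the averaged probabilities agree with the product $\mathsf{P}(A)\mathsf{P}(B)$, while the individual terms keep oscillating with period 6.

```python
from fractions import Fraction

M = 6                                  # number of points; uniform measure on Z_6


def prob(event):
    """Probability of a subset of Z_6 under the uniform measure."""
    return Fraction(len(event), M)


def preimage(event, k):
    """S^{-k}(event) for the cyclic shift S(u) = (u + 1) % M."""
    return {(a - k) % M for a in event}


A = {0, 1}                             # hypothetical events
B = {0, 3, 4}
terms = [prob(preimage(A, k) & B) for k in range(M)]   # P(S^{-k}A ∩ B), k = 0..5
cesaro = sum(terms) / M                # average over one full period
print(terms, cesaro, prob(A) * prob(B))
```

Since the terms are periodic and not constant, they have no limit as $k \to \infty$; only their averages converge.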

Condition (19) may be replaced by a more stringent requirement

$$\lim_{n \to \infty} \mathsf{P}(S^{-n}A \mid B) = \mathsf{P}(A), \tag{20}$$

which is called the mixing condition. Condition (20) is a particular case of equality

$$\lim_{n \to \infty} \mathsf{E}\, \zeta_n \eta = \mathsf{E}\zeta_0\, \mathsf{E}\eta, \tag{21}$$

where $\zeta_n = f(S^n u)$, $\eta = g(u)$, and $f(u)$ and $g(u)$ are arbitrary functions in $\mathcal{L}_2$. On the other hand (20) implies (21) for simple functions $f$ and $g$. Approximating arbitrary functions $f(u)$ and $g(u)$ in $\mathcal{L}_2$ by means of sequences of simple functions $f_n(u)$ and $g_n(u)$ converging in $\mathcal{L}_2$ to $f(u)$ and $g(u)$ respectively, it is easy to see that the mixing condition is equivalent to condition (21) ($f(u)$ and $g(u)$ are arbitrary functions in $\mathcal{L}_2$). On the other hand it is sufficient to check condition (21) for a certain set of functions whose linear span is everywhere dense in $\mathcal{L}_2$. Indicators of cylindrical sets can be chosen as such a set of functions.

Consider a sequence $\{\xi_n, n = 0, \pm 1, \dots\}$ of independent identically distributed random variables such that $\mathsf{E}|\xi_n| < \infty$. Such a sequence is stationary. In view of Birkhoff-Khinchin's theorem

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \xi_k = \xi^* \pmod{\mathsf{P}}.$$

Evidently, the random variable $\xi^*$ does not depend on any finite set of variables $\xi_0, \xi_1, \dots, \xi_n$. Therefore it is measurable with respect to $\lim_{n \to \infty} \sigma\{\xi_n, \xi_{n+1}, \dots\}$ and, in view of the zero-one law, is a constant, i.e. $\xi^* = c \pmod{\mathsf{P}}$, and moreover $c = \mathsf{E}\xi_0$. We thus obtain the following theorem:

Theorem 3. (The strong law of large numbers). If $\{\xi_n, n = 0, \pm 1, \dots\}$ is a sequence of independent identically distributed random variables and $\mathsf{E}|\xi_n| < \infty$, then with probability 1

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \xi_k = \mathsf{E}\xi_0. \tag{22}$$
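A short simulation illustrates relation (22). It is only a sketch under assumptions not made in the text: the variables are taken to be uniform on $[0, 1)$ (so that $\mathsf{E}\xi_0 = 1/2$), and a fixed seed is used so that the run is reproducible. The function name `sample_mean` is hypothetical.

```python
import random


def sample_mean(n, seed=12345):
    """Return (1/n) * sum of n independent uniform [0, 1) variables."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += rng.random()
    return total / n


mean_small = sample_mean(100)
mean_large = sample_mean(200_000)
print(mean_small, mean_large)
```

A single simulated realization cannot prove almost sure convergence; it merely shows the sample mean settling near 1/2 as the number of terms grows.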

The theorem just proved is a corollary of the ergodicity of a sequence of independent identically distributed random variables. One can, however, prove a stronger result - namely, that the shift operator in $\mathcal{X}^T$ is a mixing with respect to the measure induced in $\mathcal{X}^T$ by a sequence of independent random variables. This in turn follows from a more general assertion. Let $\{\xi_n, n = 0, \pm 1, \pm 2, \dots\}$ be a stationary sequence of random elements in $\{\mathcal{X}, \mathfrak{B}\}$, $\mathfrak{F}_n$ be the $\sigma$-algebra generated by the random elements $\xi_n, \xi_{n+1}, \dots$, and $\mathfrak{F}_\infty = \bigcap_n \mathfrak{F}_n$. We say that the zero-one law is applicable to the sequence $\{\xi_n, n = 0, \pm 1, \dots\}$ if the $\sigma$-algebra $\mathfrak{F}_\infty$ contains only those events whose probability is 0 or 1.

Theorem 4. If the sequence $\{\xi_n, n = 0, \pm 1, \dots\}$ satisfies the zero-one law then the shift transformation is a mixing.

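Let $B$ be an arbitrary event and set $\eta_{-n} = \mathsf{P}\{B \mid \mathfrak{F}_n\}$. The sequence $\{\eta_n, n = \dots, -k, -k+1, \dots, 0\}$ is a martingale (cf. Theorem 4, Chapter II, Section 2) and $\mathsf{P}\{B \mid \mathfrak{F}_\infty\}$ is its closure on the left. Since the $\sigma$-algebra $\mathfrak{F}_\infty$ is trivial, it follows that $\mathsf{P}\{B \mid \mathfrak{F}_\infty\} = \mathrm{const} = \mathsf{P}(B) \pmod{\mathsf{P}}$. In view of the theorem on convergence of martingales (Theorem 1, Corollary, Section 2 in Chapter II) $\lim_{n \to \infty} \mathsf{P}\{B \mid \mathfrak{F}_n\} = \mathsf{P}(B)$ with probability 1. Let $A$ be a cylinder over the coordinates $n = 0, 1, 2, \dots$. Then $S^{-n}A \in \mathfrak{F}_n$. Therefore for $n \to \infty$

$$\mathsf{P}(B \cap S^{-n}A) = \int_{S^{-n}A} \mathsf{P}\{B \mid \mathfrak{F}_n\}\, \mathsf{P}(du) \to \mathsf{P}(B)\, \mathsf{P}(S^{-n}A) = \mathsf{P}(B)\, \mathsf{P}(A).$$

Clearly this relationship holds for any $A \in \mathfrak{C}$. This yields relation (21) as was noted earlier. The theorem is thus proved. □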



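Consider as another example of a process satisfying the mixing condition the stationary Gaussian sequence with correlation coefficient approaching zero. Let $\{\xi_n, n = 0, \pm 1, \pm 2, \dots\}$ be a stationary Gaussian sequence, $\mathsf{E}\xi_n = m$, $\mathsf{E}(\xi_n - m)(\xi_0 - m) = R_n$, and let $f(u) = f(x_0, x_1, \dots, x_p)$ and $g(u) = g(x_0, x_1, \dots, x_p)$ be bounded sufficiently smooth functions of $p + 1$ variables possessing absolutely integrable Fourier transforms $f^*(\lambda_0, \dots, \lambda_p)$, $g^*(\mu_0, \dots, \mu_p)$. Then, with $\zeta_n = f(S^n u)$ and $\eta = g(u)$,

$$\mathsf{E}\, \zeta_n \eta = \mathsf{E} \int_{-\infty}^{\infty} \!\!\cdots\! \int_{-\infty}^{\infty} \exp\Big\{ i \sum_{k=0}^{p} (\lambda_k \xi_{n+k} + \mu_k \xi_k) \Big\}\, f^*(\lambda_0, \dots, \lambda_p)\, g^*(\mu_0, \dots, \mu_p)\, d\lambda_0 \dots d\lambda_p\, d\mu_0 \dots d\mu_p$$

$$= \int_{-\infty}^{\infty} \!\!\cdots\! \int_{-\infty}^{\infty} \exp\Big\{ i m \sum_{k=0}^{p} (\lambda_k + \mu_k) - \frac{1}{2} \sum_{k,r=0}^{p} R_{k-r} (\lambda_k \lambda_r + \mu_k \mu_r) - \sum_{k,r=0}^{p} R_{n+k-r}\, \lambda_k \mu_r \Big\}\, f^*(\lambda_0, \dots, \lambda_p)\, g^*(\mu_0, \dots, \mu_p)\, d\lambda_0 \dots d\lambda_p\, d\mu_0 \dots d\mu_p.$$

If $\lim_{n \to \infty} R_n = 0$, then passing to the limit in this relation as $n \to \infty$ we obtain

$$\lim_{n \to \infty} \mathsf{E}\, \zeta_n \eta = \mathsf{E}\zeta_0\, \mathsf{E}\eta. \tag{23}$$

Since the class of functions $f$ and $g$ for which the last relation has been proved is everywhere dense in $\mathcal{L}_2$, it follows that relation (23) is valid for arbitrary $f$ and $g$ belonging to $\mathcal{L}_2$. We have thus proved the following result.

Theorem 5. A stationary Gaussian sequence with correlation coefficient $R_n \to 0$ as $n \to \infty$ satisfies the mixing condition.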

We now present a number of corollaries and remarks related to 
ergodic properties of Markov chains. 

Consider an irreducible Markov chain with a countable number of states. It possesses an invariant initial distribution if and only if it is positive-recurrent (Corollary 2 of Theorem 15, Section 5). In turn for the last property to be valid, it is necessary and sufficient that the system of equations

$$\sum_k x_k\, p(k, j) = x_j$$

possess a non-trivial absolutely summable solution. The solution of this system satisfying the condition $\sum_k x_k = 1$, provided it exists, is of the form

$$x_k = v_k = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} p^{(n)}(j, k). \tag{24}$$

Moreover, if the chain is also aperiodic, then

$$x_k = v_k = \lim_{n \to \infty} p^{(n)}(j, k)$$

(cf. Section 5, Theorem 15). Assume that the chain is irreducible and positive-recurrent. The invariant initial distribution for this chain is unique and the corresponding stationary Markov process is $\{\xi(t), t = 0, \pm 1, \dots\}$. The condition of ergodicity of this process may be represented in the form

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \mathsf{P}\{\xi(1) = i_1, \xi(2) = i_2, \dots, \xi(s) = i_s,\ \xi(n+1) = j_1, \dots, \xi(n+r) = j_r\} = \mathsf{P}\{\xi(1) = i_1, \dots, \xi(s) = i_s\} \times \mathsf{P}\{\xi(1) = j_1, \dots, \xi(r) = j_r\}. \tag{25}$$

Indeed, on one hand this condition is a particular case of condition (18); on the other hand, however, it is easy to verify that (25) implies (18) for arbitrary cylinders $A$ and $B$ in $\mathcal{X}^T$. Furthermore condition (25) is equivalent to the equalities

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} v_i\, p^{(n)}(i, j) = v_i v_j,$$

i.e. to equalities (24).

Thus the following theorem holds:

Theorem 6. A stationary process which corresponds to an invariant initial distribution of an irreducible, positive-recurrent Markov chain is ergodic.

Remark. One can verify analogously that the mixing condition for a stationary Markov process reduces to the requirement $\lim_{n \to \infty} p^{(n)}(j, k) = v_k$; therefore a stationary process which corresponds to a positive-recurrent and aperiodic Markov chain possesses the mixing property.
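The difference between the averaged limit (24) and the ordinary limit of $p^{(n)}(j, k)$ shows up already for a two-state chain. The following sketch is illustrative only: the transition matrix `P` is a hypothetical periodic chain that deterministically alternates between its two states, and the helper names are choices made for this example. Exact rational arithmetic is used.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def cesaro_mean_of_powers(P, N):
    """(1/N) * sum_{n=1}^{N} P^n; entry (j, k) is the averaged quantity in (24)."""
    n = len(P)
    power = [row[:] for row in P]
    acc = [[Fraction(0)] * n for _ in range(n)]
    for _ in range(N):
        acc = [[acc[i][j] + power[i][j] for j in range(n)] for i in range(n)]
        power = mat_mul(power, P)
    return [[x / N for x in row] for row in acc]


# Hypothetical irreducible chain of period 2.
P = [[Fraction(0), Fraction(1)],
     [Fraction(1), Fraction(0)]]
avg = cesaro_mean_of_powers(P, 10)
P2 = mat_mul(P, P)
print(avg)
```

The averages equal the invariant distribution $(1/2, 1/2)$ in every row, whereas the powers of `P` alternate between `P` and the identity matrix and therefore do not converge; this chain is ergodic but does not satisfy the mixing condition.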




Chapter III 



Random Functions 



§ 1. Some Classes of Random Functions 

Gaussian random functions. Definition 1. A real random function $\xi(x)$, $x \in X$, is called Gaussian if for any integer $n \ge 1$ and any $x_k \in X$, $k = 1, 2, \dots, n$, the sequence $\{\xi(x_1), \xi(x_2), \dots, \xi(x_n)\}$ has a joint normal distribution.

It follows from the definition that the characteristic function of this distribution is of the form

$$J(x_1, x_2, \dots, x_n, u^1, \dots, u^n) = \exp\Big\{ i \sum_{k=1}^{n} a_k u^k - \frac{1}{2} \sum_{k,r=1}^{n} b_{kr} u^k u^r \Big\}, \tag{1}$$

where the constants $a_k$ and $b_{kr}$ satisfy

$$a_k = \mathsf{E}\xi(x_k), \qquad b_{kr} = \mathsf{E}(\xi(x_k) - a_k)(\xi(x_r) - a_r). \tag{2}$$

Thus all the marginal distributions of a Gaussian random function are determined by two real functions - the mean value $a(x)$ and the correlation function $b(x_1, x_2)$:

$$a(x) = \mathsf{E}\xi(x), \qquad b(x_1, x_2) = \mathsf{E}(\xi(x_1) - a(x_1))(\xi(x_2) - a(x_2)).$$

The correlation function $b(x, y)$ possesses the following properties:

1) $b(x, y) = b(y, x)$,

2) for any $n$, any real numbers $u_1, \dots, u_n$ and points $x_1, \dots, x_n \in X$

$$\sum_{k,r=1}^{n} b(x_k, x_r)\, u_k u_r \ge 0.$$

Real functions possessing these properties are called positive-definite kernels on $X^2$.

This definition is equivalent to the requirement that for any $n$ and $x_1, \dots, x_n \in X$ the matrix $\|b(x_k, x_r)\|$ $(k, r = 1, 2, \dots, n)$ is real, symmetric and non-negative-definite.
Note that for an arbitrary set $X$, real-valued function $a(x)$, $x \in X$, and a non-negative-definite real kernel $b(x_1, x_2)$ on $X^2$, there exists a Gaussian random function for which $a(x)$ is its mean value and $b(x_1, x_2)$ is its correlation function. To prove this assertion consider a family of distributions $F_{x_1 \dots x_n}(\cdot)$, $n = 1, 2, \dots$, $x_k \in X$, with the characteristic functions given by relations (1). It is easy to verify that this family satisfies the compatibility conditions. It now remains to apply Kolmogorov's theorem (Theorem 2, Section 4, Chapter 1).
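For finitely many points the existence statement is constructive: a vector with prescribed mean and correlation matrix can be generated from independent standard normal variables. The sketch below is an illustration and not the argument of the text (which relies on Kolmogorov's theorem). It assumes the hypothetical kernel $b(x, y) = \min(x, y)$ on the points 1, 2, 3, for which the correlation matrix is strictly positive-definite, so that a Cholesky factorization exists.

```python
import math
import random


def cholesky(B):
    """Lower-triangular L with L L^T = B, for a symmetric positive-definite B."""
    n = len(B)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(B[i][i] - s)
            else:
                L[i][j] = (B[i][j] - s) / L[j][j]
    return L


def sample_gaussian(a, L, rng):
    """One realization of a + L z, where z has independent N(0, 1) components."""
    z = [rng.gauss(0.0, 1.0) for _ in a]
    return [a[i] + sum(L[i][k] * z[k] for k in range(i + 1))
            for i in range(len(a))]


points = [1.0, 2.0, 3.0]
a = [0.0 for _ in points]                              # mean value a(x) = 0
B = [[min(x, y) for y in points] for x in points]      # hypothetical kernel min(x, y)
L = cholesky(B)
xi = sample_gaussian(a, L, random.Random(0))
print(L, xi)
```

The vector `xi` has mean `a` and correlation matrix $L L^T = B$; for a merely non-negative-definite matrix a different factorization would be needed.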

A vector Gaussian process with real components is defined analogously. Let $\xi(x)$, $x \in X$, be a random function with values in the $m$-dimensional space $\mathcal{R}^m$, $\xi(x) = (\xi^1(x), \dots, \xi^m(x))$. This function is called Gaussian if the joint distribution of all components of the sequence $\xi(x_1), \dots, \xi(x_n)$ for any $n \ge 1$ and any $x_k \in X$ is normal. The corresponding characteristic function is of the form

$$J(x_1, \dots, x_n, u_1^1, \dots, u_n^m) = \exp\Big\{ i \sum_{r=1}^{m} \sum_{k=1}^{n} a^r(x_k)\, u_k^r - \frac{1}{2} \sum_{k,l=1}^{n} \sum_{r,s=1}^{m} b^{rs}(x_k, x_l)\, u_k^r u_l^s \Big\}.$$

To simplify this expression we introduce vectors $u_k = (u_k^1, \dots, u_k^m)$, $a(x) = (a^1(x), \dots, a^m(x))$ with values in $\mathcal{R}^m$ and the matrix $b(x_1, x_2)$ with elements $b^{rs}(x_1, x_2)$, $r, s = 1, 2, \dots, m$. Then the previous expression becomes

$$J = \exp\Big\{ i \sum_{k=1}^{n} (a(x_k), u_k) - \frac{1}{2} \sum_{k,l=1}^{n} (b(x_k, x_l)\, u_l, u_k) \Big\}.$$

Here the vector function $a(x)$ may be arbitrary and the real matrix function $b(x_1, x_2)$ should be symmetric and should satisfy the condition: for any integer $n \ge 1$, any $x_k \in X$ and $u_k \in \mathcal{R}^m$

$$\sum_{k,l=1}^{n} (b(x_k, x_l)\, u_l, u_k) \ge 0. \tag{3}$$

The converse is also obvious: given an arbitrary function $a(x)$ with values in $\mathcal{R}^m$ and a matrix function $b(x_1, x_2)$ satisfying condition (3) there exists a Gaussian random function $\xi(x) = (\xi^1(x), \dots, \xi^m(x))$ for which

$$a^r(x) = \mathsf{E}\xi^r(x),$$

$$b^{rs}(x_1, x_2) = \mathsf{E}(\xi^r(x_1) - a^r(x_1))(\xi^s(x_2) - a^s(x_2)).$$

In certain problems, moments of a Gaussian random function may be useful. These can be obtained from the series expansion of characteristic functions. We shall confine ourselves to central moments of a scalar random function. Set

$$a(x) = 0, \qquad u = (u^1, \dots, u^n), \qquad B = \|b(x_k, x_r)\|, \quad k, r = 1, \dots, n.$$

Then

$$J(x_1, x_2, \dots, x_n, tu) = e^{-\frac{t^2}{2}(Bu, u)} = 1 - \frac{t^2}{2}(Bu, u) + \frac{t^4}{2!\, 2^2}(Bu, u)^2 - \dots;$$

from here it follows that

$$\mathsf{E}\Big( \sum_{k=1}^{n} \xi(x_k)\, u^k \Big)^{2s} = (2s - 1)!!\, (Bu, u)^s \quad \text{and} \quad \mathsf{E}\Big( \sum_{k=1}^{n} \xi(x_k)\, u^k \Big)^{2s-1} = 0. \tag{4}$$

We introduce the $n$-point moment functions

$$m_{j_1 j_2 \dots j_n}(x_1, x_2, \dots, x_n) = \mathsf{E}[\xi(x_1)]^{j_1} [\xi(x_2)]^{j_2} \cdots [\xi(x_n)]^{j_n}.$$

The quantity $j_1 + j_2 + \dots + j_n$ is called the order of the moment function. Moment functions of odd order are equal to zero:

$$m_{j_1 \dots j_n}(x_1, \dots, x_n) = 0 \quad \text{for} \quad \sum_{k=1}^{n} j_k = 2s - 1.$$



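Formula (4) can be written in the form

$$\sum_{j_1 + \dots + j_n = 2s} \frac{m_{j_1 \dots j_n}(x_1, \dots, x_n)}{j_1! \cdots j_n!}\, (u^1)^{j_1} \cdots (u^n)^{j_n} = \frac{1}{2^s s!}\, (Bu, u)^s. \tag{5}$$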

Moment functions of the second order coincide with the correlation function

$$m_{11}(x_1, x_2) = b(x_1, x_2), \qquad m_2(x) = m_{11}(x, x) = b(x, x).$$

For moment functions of the fourth order the following formulas are available:

$$m_4(x) = 3 b^2(x, x), \qquad m_{22}(x_1, x_2) = b(x_1, x_1)\, b(x_2, x_2) + 2 b^2(x_1, x_2),$$

$$m_{31}(x_1, x_2) = 3 b(x_1, x_1)\, b(x_1, x_2),$$

$$m_{211}(x_1, x_2, x_3) = b(x_1, x_1)\, b(x_2, x_3) + 2 b(x_1, x_2)\, b(x_1, x_3),$$

$$m_{1111}(x_1, x_2, x_3, x_4) = b(x_1, x_2)\, b(x_3, x_4) + b(x_1, x_3)\, b(x_2, x_4) + b(x_1, x_4)\, b(x_2, x_3).$$

Generally the following relation

$$m_{j_1 \dots j_n}(x_1, \dots, x_n) = \sum \prod b(x_p, x_q) \tag{6}$$

holds. The structure of this formula can be described as follows: we write down the points $x_1, x_2, \dots, x_n$ into a sequence where $x_k$ is written $j_k$ times. This sequence is then subdivided into arbitrary pairs. The product in the r.h.s. of formula (6) is taken over all the pairs of this subdivision and the sum is taken over all the subdivisions (those pairs which are permutations of one another are counted once). This assertion follows directly from formula (5).
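The pairing rule behind formula (6) translates directly into a short recursive enumeration. The sketch below is illustrative: the function names are hypothetical, repeated copies of the same point are treated as distinct positions in the sequence, and the kernel $b(x, y) = \min(x, y)$ is an assumed example of a correlation function.

```python
def pairings(items):
    """Yield every way of splitting the list `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i in range(len(rest)):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, rest[i])] + tail


def moment(b, points, orders):
    """Moment function m_{j_1...j_n}(x_1, ..., x_n) computed by formula (6);
    the point x_k is written j_k times before the sequence is paired."""
    seq = [x for x, j in zip(points, orders) for _ in range(j)]
    if len(seq) % 2 == 1:
        return 0                      # moments of odd order vanish
    total = 0
    for pairing in pairings(seq):
        prod = 1
        for p, q in pairing:
            prod *= b(p, q)
        total += prod
    return total


def b(x, y):
    """Hypothetical correlation function used for the example."""
    return min(x, y)


m4 = moment(b, [2], [4])                          # 3 * b(2, 2)^2
m22 = moment(b, [1, 3], [2, 2])                   # b(1,1) b(3,3) + 2 b(1,3)^2
m1111 = moment(b, [1, 2, 3, 4], [1, 1, 1, 1])
print(m4, m22, m1111)
```

A sequence of $2s$ entries has $(2s - 1)!!$ pairings, so this direct enumeration is practical only for moments of low order.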

Complex-valued Gaussian random functions are considered in a number of problems. Their definition involves a feature which distinguishes them from general vector Gaussian functions with real components. We shall discuss only functions $\{\zeta(x), x \in X\}$ with values in the complex plane. Set $\zeta(x) = \xi(x) + i\eta(x)$, where $\xi(x)$ and $\eta(x)$ are real.

Definition 2. A random function $\{\zeta(x), x \in X\}$ is called a complex Gaussian random function if the real vector function $\{(\xi(x), \eta(x)), x \in X\}$ is Gaussian and $\mathsf{E}(\zeta(x) - a(x))(\zeta(y) - a(y)) = 0$, $a(x) = \mathsf{E}\zeta(x)$, for any $x, y \in X$.

It may be assumed without loss of generality that $a(x) = 0$. It is easy to verify that the condition $\mathsf{E}\zeta(x)\zeta(y) = 0$ is equivalent to the conditions

$$\mathsf{E}\xi(x)\xi(y) = \mathsf{E}\eta(x)\eta(y), \qquad \mathsf{E}\xi(x)\eta(y) = -\mathsf{E}\xi(y)\eta(x). \tag{7}$$

On the other hand if equalities (7) are satisfied, then

$$b(x, y) = \mathsf{E}\zeta(x)\overline{\zeta(y)} = 2\big(b_{11}(x, y) - i\, b_{12}(x, y)\big), \tag{8}$$

where $b_{11}(x, y) = \mathsf{E}\xi(x)\xi(y)$, $b_{12}(x, y) = \mathsf{E}\xi(x)\eta(y)$. From conditions (7) it follows in particular that $b_{12}(x, x) = \mathsf{E}\xi(x)\eta(x) = 0$, and since $(\xi(x), \eta(x))$ have a joint Gaussian distribution, the variables $\xi(x)$ and $\eta(x)$ are independent. If we now put $\zeta(x) = |\zeta(x)|\, e^{i\varphi(x)}$ then, as it is easy to verify, the variables $|\zeta(x)|$ and $\varphi(x)$ will be independent, $\varphi(x)$ having the uniform distribution on $(-\pi, \pi)$ and $|\zeta(x)|$ having the density given by

$$\frac{u}{\sigma^2(x)}\, e^{-\frac{u^2}{2\sigma^2(x)}}, \quad u > 0, \qquad \sigma^2(x) = \mathsf{V}\xi(x) = b_{11}(x, x).$$

In relation (8) the function $b_{11}(x, y)$ is a non-negative-definite kernel, and $b_{12}(x, y)$ possesses the property that $b_{12}(x, y) = -b_{12}(y, x)$. Utilizing






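Consider the "truncated" variables $\alpha_{nk}^{\varepsilon}(x)$ and their moments

$$\alpha_{nk}^{\varepsilon}(x) = \chi_{\varepsilon}(\alpha_{nk}(x))\, \alpha_{nk}(x), \qquad a_{nk}^{\varepsilon}(x) = \mathsf{E}\alpha_{nk}^{\varepsilon}(x),$$

$$b_{nk}^{\varepsilon}(x_1, x_2) = \mathsf{E}[\alpha_{nk}^{\varepsilon}(x_1) - a_{nk}^{\varepsilon}(x_1)]\, [\alpha_{nk}^{\varepsilon}(x_2) - a_{nk}^{\varepsilon}(x_2)],$$

where $\varepsilon > 0$ and $\chi_{\varepsilon}(x)$ is the indicator of the interval $(-\varepsilon, \varepsilon)$.

Theorem 2. Let the functions $\alpha_{n1}(x), \alpha_{n2}(x), \dots, \alpha_{n m_n}(x)$ be mutually independent for each $n$ and satisfy the conditions:

1) for any $\varepsilon > 0$

$$\sum_{k=1}^{m_n} \mathsf{P}\{|\alpha_{nk}(x)| > \varepsilon\} \to 0 \quad \text{as } n \to \infty;$$

2) for some $\varepsilon = \varepsilon_0 = \varepsilon_0(x) > 0$

$$\sum_{k=1}^{m_n} a_{nk}^{\varepsilon}(x) \to a(x), \qquad \sum_{k=1}^{m_n} b_{nk}^{\varepsilon}(x_1, x_2) \to b(x_1, x_2) \tag{9}$$

as $n \to \infty$. Then the marginal distributions of the random function $\eta_n(x)$ as $n \to \infty$ converge weakly to the corresponding marginal distributions of a Gaussian random function with mathematical expectation $a(x)$ and correlation function $b(x_1, x_2)$.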

Processes with independent increments. Let $T$ be a finite or infinite interval closed on the left, $a = \min T > -\infty$.

Definition 3. A random process $\{\xi(t), t \in T\}$ with values in $\mathcal{R}^m$ is a process with independent increments if for any $n$ and $t_k \in T$, $t_1 < t_2 < \dots < t_n$, the random vectors $\xi(a), \xi(t_1) - \xi(a), \dots, \xi(t_n) - \xi(t_{n-1})$ are mutually independent. The vector $\xi(a)$ is called the initial state (value) of the process and its distribution is called the initial distribution of the process.

To define a process with independent increments in the wide sense it is sufficient to define the initial distribution $P_0(B)$ and a family of probabilities $P(t, h, B)$ $(t \ge 0,\ h > 0,\ B \in \mathfrak{B}^m)$, where $\mathfrak{B}^m$ is the $\sigma$-algebra of Borel sets in $\mathcal{R}^m$ and $P(t, h, B)$ is the distribution of the vector $\xi(t + h) - \xi(t)$. Indeed, if these distributions are given, then arbitrary joint distributions of the vectors $\xi(t)$ are uniquely determined by the formula

$$\mathsf{P}\Big\{ \bigcap_{k=0}^{n} \{\xi(t_k) \in B_k\} \Big\} = \int_{B_0} P_0(dy_0) \int_{B_1 - y_0} P(0, t_1, dy_1) \int_{B_2 - (y_0 + y_1)} P(t_1, t_2 - t_1, dy_2) \cdots \int_{B_n - (y_0 + \dots + y_{n-1})} P(t_{n-1}, t_n - t_{n-1}, dy_n), \tag{10}$$

where $0 = t_0 < t_1 < \dots < t_n$.
this fact it is easy to verify that the function $b(x, y)$, determined by formula (8) in which $b_{11}(x, y)$ and $b_{12}(x, y)$ are arbitrary functions possessing these properties, satisfies the relation

$$\sum_{k,r=1}^{n} b(x_k, x_r)\, \lambda_k \bar{\lambda}_r \ge 0$$

for arbitrary $n$, $x_k \in X$ and arbitrary complex numbers $\lambda_k$. Functions possessing these properties are called non-negative-definite kernels on $X^2$.

Theorem 1. For any non-negative-definite kernel $b(x, y)$ $(x, y \in X)$ there exists a complex Gaussian random function $\zeta(x)$ for which $\mathsf{E}\zeta(x) = 0$ and

$$\mathsf{E}\zeta(x)\overline{\zeta(y)} = b(x, y).$$

To prove this assertion we introduce a real matrix function of second order $B(x, y) = \|b_{ik}(x, y)\|$ $(i, k = 1, 2)$ putting

$$b_{11}(x, y) = b_{22}(x, y) = \tfrac{1}{2}\, b'(x, y), \qquad b_{12}(x, y) = -b_{21}(x, y) = -\tfrac{1}{2}\, b''(x, y),$$

where $b'(x, y) = \mathrm{Re}\, b(x, y)$, $b''(x, y) = \mathrm{Im}\, b(x, y)$. Since $b(x, y)$ is a non-negative-definite kernel, $b(x, y) = \overline{b(y, x)}$ and it follows from here that $b_{11}(x, y) = b_{11}(y, x)$, $b_{12}(x, y) = -b_{12}(y, x)$. Construct a two-dimensional Gaussian random function $(\xi(x), \eta(x))$ with correlation matrix $B(x, y)$. In view of the previous remarks $\zeta(x) = \xi(x) + i\eta(x)$ is a complex Gaussian random function and

$$\mathsf{E}\zeta(x)\overline{\zeta(y)} = 2\big(b_{11}(x, y) - i\, b_{12}(x, y)\big) = b'(x, y) + i\, b''(x, y) = b(x, y). \ \ □$$

The fact that Gaussian random functions play an important role in 
practical problems may often be explained as follows. Under very 
general conditions the sum of a large number of independent small (in 
magnitude) random functions is approximately a Gaussian random 
function independently of the probabilistic nature of the components 
(summands). This assertion is the so-called theorem on normal corre- 
lation which is a multivariate generalization of the central limit theorem. 
We now present one version of this theorem. 

Let a double sequence of random functions $\{\alpha_{nk}(x), x \in X\}$, $k = 1, 2, \dots, m_n$, $n = 1, 2, \dots$ be given. Set

$$\eta_n(x) = \sum_{k=1}^{m_n} \alpha_{nk}(x).$$
Here $B - z$ denotes the set $\{x : x = y - z,\ y \in B\}$. As far as the initial distribution is concerned it may be chosen arbitrarily. On the other hand, one cannot guarantee that a process exists with independent increments which corresponds to an arbitrarily defined family of distributions $P(t, h, B)$.

In order that this be the case, it is necessary and sufficient that $P(t, h, B)$ possess the following property: for an arbitrary $n$ and any $t = t_0 < t_1 < \ldots < t_n = t + h$, $P(t, h, B)$ is a distribution of sums $\xi_1 + \ldots + \xi_n$ of independent random vectors where $\xi_k$ is distributed according to $P(t_{k-1}, t_k - t_{k-1}, B)$.

Indeed, if this condition is satisfied, then the family of distributions (10) satisfies the compatibility conditions. Therefore Kolmogorov's theorem is applicable and there exists a random process with marginal distributions (10). The form of these distributions indicates that the process has independent increments.

It is convenient to study processes with independent increments using 
characteristic functions. 

Set

$$J(t, h, u) = \int_{\mathcal{R}^m} e^{i(u, x)}\, P(t, h, dx).$$

Function $J(t, h, u)$ is called the characteristic function of a process with independent increments. This function completely determines the joint distribution of the differences

$$\xi(t_1) - \xi(a),\ \xi(t_2) - \xi(t_1),\ \ldots,\ \xi(t_n) - \xi(t_{n-1}). \tag{11}$$

Indeed the joint distribution of the sequence of vectors (11) has its characteristic function $J(t_1, \ldots, t_n, u^1, \ldots, u^n)$ equal to

$$J(t_1, \ldots, t_n, u^1, \ldots, u^n) = \prod_{k=1}^{n} J(t_{k-1}, t_k - t_{k-1}, u^k), \qquad t_0 = a.$$

Therefore to define a process with independent increments in the wide sense it is sufficient to define $J(t, h, u)$ (in addition to $P_0(B)$). The necessary and sufficient condition on $P(t, h, B)$ stated above means that the characteristic function $J(t, h, u)$ considered as a function of the interval $[t, t + h)$ must be multiplicative:

$$J(t, h_1 + h_2, u) = J(t, h_1, u)\, J(t + h_1, h_2, u).$$

In turn this condition is necessary and sufficient in order that $J(t, h, u)$ be the characteristic function of a process with independent increments.
Definition 4. A process with independent increments is called homogeneous if the differences $\xi(t + h) - \xi(t)$ are distributed independently of $t$, i.e. $P(t, h, B) = P(h, B)$. A homogeneous process is called stochastically continuous if

$$\lim_{h \to 0} P(h, \bar S_\varepsilon) = 0$$

for any sphere $S_\varepsilon = \{x : |x| < \varepsilon\}$, $\varepsilon > 0$, where $\bar S_\varepsilon$ denotes the complement of $S_\varepsilon$.

(See Section 2 for additional details concerning the condition of stochastic continuity and its significance.) If a homogeneous process is stochastically continuous then for any $t$ the difference $\xi(t + h) - \xi(t)$ converges in probability to zero and hence the distribution of $\xi(t + h) - \xi(t)$ is weakly convergent to the distribution concentrated at zero (as $h \to 0$). In view of the continuity of the correspondence between distributions and their characteristic functions it follows that stochastic continuity is equivalent to the following property: for $h \downarrow 0$, $J(h, u) \to 1$ uniformly in any bounded region $|u| \le N$.

We note a few properties of characteristic functions of homogeneous 
stochastically continuous processes with independent increments. 

a) The characteristic function of a homogeneous process with independent increments satisfies equation

$$J(h_1 + h_2, u) = J(h_1, u)\, J(h_2, u). \tag{12}$$

In particular for any integral $n$

$$J(nh, u) = [J(h, u)]^n.$$

b) The characteristic function $J(h, u)$ of a homogeneous stochastically continuous process vanishes nowhere.

Indeed, for an arbitrary $u$ one can find $t_0$ such that $|J(h, u)| \ge \frac12$ for $0 < h \le t_0$. If $t$ is arbitrary and $t = t_0(n + \theta)$, where $0 \le \theta < 1$, then $J(t, u) = J(t_0 n, u)\, J(t_0\theta, u) = [J(t_0, u)]^n J(t_0\theta, u)$; thus $|J(t, u)| \ge (\frac12)^{n+1}$. Since $J(h, u) \to 1$ as $h \downarrow 0$ uniformly in an arbitrary sphere $|u| \le N$, it is possible to define a single-valued function $g_1(t, u) = \ln J(t, u)$ in the region $t \in [0, h]$, $|u| \le N$, $h = h(N)$, and this function is also jointly continuous in its variables $t$ and $u$ in the region under consideration. It follows from (12) that $g_1(t, u)$ satisfies equation

$$g_1(t_1 + t_2, u) = g_1(t_1, u) + g_1(t_2, u), \qquad t_1 + t_2 \le h.$$

Therefore $g_1(t, u) = t g(u)$ and $J(t, u) = e^{t g(u)}$. It is easy to verify that the last equality should be satisfied for all $t$ and $u$. Indeed, if this equality holds for given $u$ and for all $0 < t \le h$, then for an arbitrary $t$

$$J(t, u) = \Big[J\Big(\frac{t}{n}, u\Big)\Big]^n = \Big[e^{\frac{t}{n} g(u)}\Big]^n = e^{t g(u)} \quad \text{for } \frac{t}{n} \le h.$$

Hence

$$J(t, u) = e^{t g(u)}, \tag{13}$$

where $g(u)$ is a single-valued continuous function.

This simple result completely characterizes the dependence of the characteristic function $J(t, u)$ on $t$. Clearly, a characteristic function of form (13) satisfies condition (12). The structure of function $g(u)$ remains to be determined. It follows from the above that $g(u)$ can be arbitrary provided that $e^{t g(u)}$ is the characteristic function of a certain distribution for each $t$. It follows from (13) that

$$g(u) = \lim_{t \downarrow 0} \frac{J(t, u) - 1}{t} \tag{14}$$

and the convergence is uniform in every bounded sphere $|u| \le N$, $0 < N < \infty$.
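Relations (12), (13) and (14) can be checked numerically. The sketch below assumes a hypothetical one-dimensional exponent made of a drift, a Gaussian part and Poisson jumps; the parameter values `a`, `b`, `q`, `z0` are illustrative choices, not taken from the text.

```python
import cmath


def g(u, a=0.3, b=2.0, q=1.5, z0=0.7):
    """Hypothetical exponent: drift a, Gaussian part b, Poisson jumps of size z0 at rate q."""
    return 1j * a * u - 0.5 * b * u * u + q * (cmath.exp(1j * u * z0) - 1)


def J(t, u):
    """Characteristic function of an increment over a time interval of length t, as in (13)."""
    return cmath.exp(t * g(u))


def difference_quotient(t, u):
    """The quantity (J(t, u) - 1) / t, whose limit as t decreases to 0 is g(u), as in (14)."""
    return (J(t, u) - 1) / t
```

For example, `J(0.8, u)` agrees with `J(0.3, u) * J(0.5, u)` up to rounding, and `difference_quotient(1e-6, u)` is close to `g(u)`; the error of the quotient is of order $t\,|g(u)|^2/2$.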



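Theorem 3. Let $J(t, u)$, $t > 0$, be a family of characteristic functions such that the limit (14) exists uniformly in an arbitrary sphere $|u| \le N$, $N > 0$. Then there exists in $\{\mathcal{R}^m, \mathfrak{B}^m\}$ a finite measure $\Pi(B)$, a non-negative-definite operator $b$ defined in $\mathcal{R}^m$ and a vector $a$ such that

$$g(u) = i(a, u) - \frac12 (bu, u) + \int_{\mathcal{R}^m} \Big(e^{i(u, z)} - 1 - \frac{i(u, z)}{1 + |z|^2}\Big) \frac{1 + |z|^2}{|z|^2}\, \Pi(dz). \tag{15}$$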



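Proof. Let $\{Q_t(\cdot), \mathfrak{B}^m\}$ be the distribution corresponding to characteristic function $J(t, u)$. Set

$$\Pi_t(B) = \frac{1}{t} \int_B \frac{|z|^2}{1 + |z|^2}\, Q_t(dz).$$

It will be shown below that the family of measures $\{\Pi_t(\cdot), t > 0\}$ is weakly compact. Choose a sequence $t_n \downarrow 0$ such that $\Pi_{t_n}$ converge weakly to a certain measure $\Pi'$ on $\mathfrak{B}^m$. Next

$$\frac{J(t, u) - 1}{t} = \int_{\mathcal{R}^m} \big(e^{i(u, z)} - 1\big) \frac{1 + |z|^2}{|z|^2}\, \Pi_t(dz) = i A_t(u) - \frac12 B_t(u) + \int_{\mathcal{R}^m} f(u, z)\, \Pi_t(dz), \tag{16}$$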
where

$$A_t(u) = \int_{\mathcal{R}^m} \frac{(u, z)}{|z|^2}\, \Pi_t(dz), \qquad B_t(u) = \int_{\mathcal{R}^m} \frac{(u, z)^2}{|z|^2}\, \Pi_t(dz),$$

$$f(u, z) = \Big(e^{i(u, z)} - 1 - \frac{i(u, z)}{1 + |z|^2} + \frac{(u, z)^2}{2(1 + |z|^2)}\Big) \frac{1 + |z|^2}{|z|^2}.$$

If we define $f(u, 0) = 0$, then $f(u, z)$ becomes a continuous and bounded function. Therefore

$$\lim_{n \to \infty} \int f(u, z)\, \Pi_{t_n}(dz) = \int f(u, z)\, \Pi'(dz).$$

Since the limit in the l.h.s. of equation (16) exists at $t = t_n$ as $n \to \infty$, the limits

$$\lim_{n \to \infty} A_{t_n}(u) = a(u), \qquad \lim_{n \to \infty} B_{t_n}(u) = B(u)$$

also exist, where, moreover, $a(u)$ is a linear function, $B(u)$ is a positive definite quadratic form, i.e. $a(u) = (a, u)$ and $B(u) = (b'u, u)$ where $b'$ is a positive-definite symmetric operator. Approaching through sequence $t_n$ to the limit in (16) we obtain

$$g(u) = i(a, u) - \frac12 (b'u, u) + \int_{\mathcal{R}^m} f(u, z)\, \Pi'(dz). \tag{17}$$



Let $\Pi(A) = \Pi'(A - \{0\})$ ($\{0\}$ is a singleton containing point 0). In the r.h.s. of equality (17) the measure $\Pi'(\cdot)$ appearing in the integral may be replaced by measure $\Pi(\cdot)$.

On the other hand, the integral

$$\int_{\mathcal{R}^m} \frac{(u, z)^2}{|z|^2}\, \Pi(dz)$$

exists and represents a certain positive-definite quadratic form $(b''u, u)$. It is easy to verify that $(b'u, u) \ge (b''u, u)$. Therefore the operator $b = b' - b''$ is a positive-definite symmetric operator. We thus obtain

$$g(u) = i(a, u) - \frac12 (bu, u) + \int_{\mathcal{R}^m} \Big(e^{i(u, z)} - 1 - \frac{i(u, z)}{1 + |z|^2}\Big) \frac{1 + |z|^2}{|z|^2}\, \Pi(dz),$$

which proves (15).

We proceed to the verification of the weak compactness of the family $\{\Pi_t, t > 0\}$. It is required to show that

a) $\Pi_t(\mathcal{R}^m) \le K$ for some constant $K$; b) $\lim\limits_{N \to \infty} \varlimsup\limits_{t \downarrow 0} \Pi_t\{\bar S_N\} = 0$,

where $\bar S_N = \{z : |z| > N\}$.

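Let $|u| \le N_1$, $N_1$ be arbitrary. It follows from the conditions of the theorem and (16) that for any $\delta > 0$ a $t_0 = t_0(N_1, \delta)$ can be found such that

$$-\operatorname{Re} g(u) + \delta \ge \int_{S_1} \frac{1 - \cos(u, z)}{|z|^2}\, \Pi_t(dz), \qquad t < t_0, \tag{18}$$

and for $c > 0$

$$-\operatorname{Re} g(u) + \delta \ge \int_{\bar S_c} [1 - \cos(u, z)]\, \Pi_t(dz), \qquad t < t_0. \tag{19}$$

Since $1 - \cos x \ge \dfrac{x^2}{2!} - \dfrac{x^4}{4!}$, it follows from (18) that

$$-\operatorname{Re} g(u) + \delta \ge \int_{S_1} \Big[\frac{(u, z)^2}{2!} - \frac{(u, z)^4}{4!}\Big] \frac{1}{|z|^2}\, \Pi_t(dz). \tag{20}$$

To obtain the required bounds the values of the following integrals are needed:

$$J(\rho) = \int_{S_\rho} \cos(u, z)\, du, \qquad J_k(\rho) = \int_{S_\rho} (u, z)^k\, du, \quad k = 2, 4.$$

These are:

$$J(\rho) = \Big(\frac{2\pi\rho}{|z|}\Big)^{m/2} I_{m/2}(\rho |z|), \qquad J_2(\rho) = \frac{\pi^{m/2} \rho^{m+2} |z|^2}{2\Gamma\big(\frac{m}{2} + 2\big)}, \qquad J_4(\rho) = \frac{3\pi^{m/2} \rho^{m+4} |z|^4}{4\Gamma\big(\frac{m}{2} + 3\big)}, \tag{21}$$

where $I_{m/2}$ denotes the Bessel function of the first kind of order $m/2$.

Integrating inequalities (19) and (20) with respect to $u$ over the sphere $S_\rho = \{u : |u| \le \rho\}$ and dividing them by the volume $V_\rho = \omega_m \rho^m$ of this sphere, where $\omega_m = \dfrac{\pi^{m/2}}{\Gamma\big(\frac{m}{2} + 1\big)}$, we obtain

$$-\frac{1}{\omega_m \rho^m} \int_{S_\rho} \operatorname{Re} g(u)\, du + \delta \ge \frac{\rho^2}{2(m + 2)} \int_{S_1} \Big[1 - \frac{\rho^2 |z|^2}{4(m + 4)}\Big]\, \Pi_t(dz) \tag{22}$$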
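and

$$-\frac{1}{\omega_m \rho^m} \int_{S_\rho} \operatorname{Re} g(u)\, du + \delta \ge \int_{\bar S_c} \Big[1 - \Gamma\Big(\frac{m}{2} + 1\Big) \Big(\frac{2}{\rho |z|}\Big)^{m/2} I_{m/2}(\rho |z|)\Big]\, \Pi_t(dz). \tag{23}$$

Choosing in (22) the value of $\rho$ from the condition $\rho^2 = 2(m + 4)$ and taking $N_1 > \rho$ we obtain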



i7, 



,(Si)^2 I RQg{u)du 



Since the function is bounded, one can choose for any oO the 

value ^ = ^1 from condition 







sup|/^/ 2 (x)|. 

X>0 



We thus obtain 







RQg{u)du , 



(24) 



which shows that $\Pi_t(\mathcal{R}^m) < K$. Finally noting that for $\rho \to 0$ we have

$$\frac{1}{\omega_m \rho^m} \int_{S_\rho} \operatorname{Re} g(u)\, du \to g(0) = 0,$$

we first choose $\rho = \rho_2$ sufficiently small so that the l.h.s. of inequality (23) is less than or equal to $2\delta$ and then choose $c = N = N_\delta$ in such a manner that (24) is satisfied. We then obtain

$$\Pi_t(\bar S_N) < 4\delta,$$

and these inequalities are valid independently of $t \in [0, t_0)$, $t_0 = t_0(N_1, \delta)$. The theorem is thus proved. $\Box$

From the results obtained follows

Theorem 4. If $\xi(t)$, $t \ge 0$ is a homogeneous stochastically continuous process with values in $\mathcal{R}^m$ then the characteristic function $J(t, u)$ of the difference $\xi(s + t) - \xi(s)$ is of the form

$$J(t, u) = e^{t g(u)}, \tag{25}$$

where $g(u)$ is given by formula (15).
Consider now a few particular cases of formula (25).

a) $b = 0$, $\Pi(B) = 0$.

In this case $J(t, u) = e^{it(a, u)}$ which corresponds to the characteristic function of a degenerate distribution concentrated at point $ta$. Thus $\xi(t) = \xi(0) + at$ with probability 1 and point $\xi(t)$ moves uniformly with a constant velocity $a$.

b) $\Pi(B) = 0$.

In this case the increments $\xi(s + t) - \xi(s)$ are normally distributed with the mean $at$ and correlation matrix $bt$ so that if for example $\xi(0) = 0$, the process $\xi(t)$ is Gaussian. In Section 5 of the present Chapter it will be shown that in this and only this case the process with independent increments is stochastically equivalent to a process with continuous (with probability 1) sample functions. The process under consideration is called Brownian motion.

As is known, if one observes a small particle of colloidal dimensions immersed in a liquid through a very powerful microscope, one notices that the particle is in constant motion and that its path is a very complicated broken line with randomly oriented segments. This phenomenon is due to collisions between the molecules of the liquid and the colloidal particle. The dimensions of the particle are large compared with the molecules of the liquid, and a huge number of molecules collide with the particle during a time period of one second. The result of each single collision is impossible to detect. The motion of the particle is called Brownian motion. As a rough approximation it is assumed that the changes in position of the particle resulting from collisions with the molecules of the medium are independent, and Brownian motion is considered as a continuous process with independent increments. In view of the above, such a process is Gaussian. If $\xi(t)$ is one-dimensional, $b = 1$, $a = 0$, then the Brownian motion is called a Wiener process.

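c) $a = 0$, $b = 0$, the measure $\Pi$ represents a mass of magnitude $q$ concentrated at point $z_0$.

In this case the characteristic function (25) is of the form

$$J(t, u) = \exp\Big\{ qt\, \frac{1 + |z_0|^2}{|z_0|^2} \Big( e^{i(u, z_0)} - 1 - \frac{i(u, z_0)}{1 + |z_0|^2} \Big) \Big\}. \tag{26}$$

It is easy to verify that the increment $\xi(t) - \xi(0)$ can be represented in the form

$$\xi(t) - \xi(0) = z_0\, \nu(t) - \frac{qt}{|z_0|^2}\, z_0,$$

where $\nu(t)$ is a Poisson process with mean value $E\nu(t) = qt\, \dfrac{1 + |z_0|^2}{|z_0|^2}$.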
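The representation in case c) can be verified by comparing characteristic functions. The sketch below works in one dimension, where $z_0/|z_0|^2 = 1/z_0$; the function names and the test values of `q` and `z0` are illustrative assumptions. It evaluates formula (26) directly and compares it with the characteristic function of a scaled Poisson variable with a linear drift.

```python
import cmath


def J_formula_26(t, u, q, z0):
    """Formula (26) in one dimension for a mass q concentrated at the point z0."""
    weight = q * t * (1 + z0 * z0) / (z0 * z0)
    return cmath.exp(weight * (cmath.exp(1j * u * z0) - 1 - 1j * u * z0 / (1 + z0 * z0)))


def J_poisson_with_drift(t, u, q, z0):
    """Characteristic function of z0 * nu(t) - q * t / z0, nu(t) Poisson with mean q t (1 + z0^2) / z0^2."""
    mean = q * t * (1 + z0 * z0) / (z0 * z0)
    drift = -q * t / z0
    # E exp(i u z0 nu) = exp(mean * (exp(i u z0) - 1)) for a Poisson variable nu.
    return cmath.exp(mean * (cmath.exp(1j * u * z0) - 1)) * cmath.exp(1j * u * drift)
```

The two functions agree for every `t`, `u`, `q` and non-zero `z0`, since the term $-i(u, z_0)/(1 + |z_0|^2)$ multiplied by the Poisson mean is exactly the drift contribution.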
d) Let $b = 0$ and the measure $\Pi$ satisfy

$$\int_{\mathcal{R}^m} \frac{\Pi(dz)}{|z|^2} < \infty. \tag{27}$$

In this case $g(u)$ can be represented in the form

$$g(u) = i(a, u) + q \int_{\mathcal{R}^m} \big(e^{i(u, z)} - 1\big)\, \Pi_0(dz), \tag{28}$$

where $q > 0$ and $\Pi_0$ is a probability measure on $\{\mathcal{R}^m, \mathfrak{B}^m\}$. The interpretation of this measure is as follows: We have

$$J(t, u) = e^{it(a, u)} \sum_{n=0}^{\infty} e^{-qt} \frac{(qt)^n}{n!} \Big[ \int_{\mathcal{R}^m} e^{i(u, z)}\, \Pi_0(dz) \Big]^n,$$

which represents the characteristic function of the sum

$$at + \xi_1 + \xi_2 + \ldots + \xi_{\nu(t)},$$

where $\xi_1, \xi_2, \ldots$ are independent and identically distributed random vectors with values in $\mathcal{R}^m$ distributed according to $\Pi_0$, $a$ is a constant vector, $\nu(t)$ is an integer-valued random variable, independent of the family $\{\xi_k, k = 1, 2, \ldots\}$, obeying the Poisson distribution with parameter $qt$:

$$P\{\nu(t) = n\} = e^{-qt} \frac{(qt)^n}{n!}.$$

This process is called the generalized Poisson process in $\mathcal{R}^m$.
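The series for $J(t, u)$ in case d) can be summed numerically and compared with the closed form $e^{t g(u)}$ with $g$ given by (28). The sketch below is one-dimensional and assumes a hypothetical two-point jump law $\Pi_0$ (jumps $+1$ and $-2$ with probabilities 0.6 and 0.4); the names and the truncation level `terms` are illustrative.

```python
import cmath
import math


def phi(u):
    """Characteristic function of a hypothetical jump law: +1 w.p. 0.6 and -2 w.p. 0.4."""
    return 0.6 * cmath.exp(1j * u) + 0.4 * cmath.exp(-2j * u)


def J_series(t, u, a, q, terms=60):
    """Truncated Poisson mixture: sum over n of P{nu(t) = n} * phi(u)**n, times the drift factor."""
    total = 0
    weight = math.exp(-q * t)  # Poisson probability of n = 0
    power = 1                  # phi(u) ** n
    for n in range(terms):
        total += weight * power
        weight *= q * t / (n + 1)
        power *= phi(u)
    return cmath.exp(1j * t * a * u) * total


def J_closed(t, u, a, q):
    """Closed form exp(t * g(u)) with g(u) = i a u + q (phi(u) - 1), as in (28)."""
    return cmath.exp(t * (1j * a * u + q * (phi(u) - 1)))
```

With `q * t` of moderate size the truncated series and the closed form coincide up to rounding, which is the identity $\sum_n e^{-qt}(qt)^n \varphi^n / n! = e^{qt(\varphi - 1)}$.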

Note that for any function $g(u)$ defined by formula (15) a sequence of functions $g_n(u)$ of form (28) convergent to it can be constructed. Since the members of the sequence determine characteristic functions of certain distributions, the function $e^{t g(u)}$, where $g(u)$ is an arbitrary function of form (15), is the characteristic function of a certain distribution. We thus have the following

Theorem 5. In order that the process $\xi(t)$ be a homogeneous stochastically continuous process with independent increments it is necessary and sufficient that its characteristic function be represented by formulas (25) and (15) where $a$ is an arbitrary vector, $b$ an arbitrary positive-definite operator and $\Pi(B)$ an arbitrary finite measure on $\{\mathcal{R}^m, \mathfrak{B}^m\}$ and $\Pi\{z = 0\} = 0$.

Markov processes. Markov processes play a most important role in modern probability theory and its applications. They are studied in detail in Volume II. Here we give only the simplest definition of this class of processes. The notion of a discrete parameter Markov process was introduced and discussed in Section 4 of Chapter II.

The notion of a Markov process (a Markov system) is based on the representation of a system whose behavior in the future depends only on the present state of the system (i.e. does not depend on the past behavior of the system). Let $\xi(t)$, $t \in T$, where $T$ is a finite or infinite interval of time, be a random process with values in a complete metric space $\mathcal{Y}$ and let $\mathfrak{B}$ be a $\sigma$-algebra of Borel sets in $\mathcal{Y}$.

The space $\mathcal{Y}$ is called the phase space of the system, $\xi(t)$ is its state at time $t$. The hypothesis of "independence of the future from the past" or equivalently "the absence of after-effect" can be most simply described using conditional probabilities as follows:

$$P\{\xi(t) \in A \mid \xi(t_1), \ldots, \xi(t_n)\} = P\{\xi(t) \in A \mid \xi(t_n)\} \pmod P \tag{29}$$

for any $A \in \mathfrak{B}$ and $t_1 < t_2 < \ldots < t_n < t$. Since the conditional probability given a random variable can be regarded as a function of this variable, we set

$$P\{\xi(t) \in A \mid \xi(s)\} = P(s, \xi(s), t, A) \qquad (s < t).$$

It follows from formula (19) Section 3 Chapter I that for $t_1 < t_2 < \ldots < t_n$ the following equality holds for an arbitrary bounded Borel function $g(x_1, x_2, \ldots, x_n)$ ($x_k \in \mathcal{Y}$, $k = 1, 2, \ldots, n$):

$$E\{g(\xi(t_1), \ldots, \xi(t_n)) \mid \xi(t_1)\} = \int P(t_1, \xi(t_1), t_2, dy_2) \int P(t_2, y_2, t_3, dy_3) \cdots \int P(t_{n-1}, y_{n-1}, t_n, dy_n)\, g(\xi(t_1), y_2, \ldots, y_n) \pmod P. \tag{30}$$

In particular if we put $g = \chi_B(x_3)$, where $\chi_B(\cdot)$ is the indicator of the set $B \in \mathfrak{B}$, then it follows from (30) that with probability 1

$$P(t_1, \xi(t_1), t_3, B) = \int P(t_2, y_2, t_3, B)\, P(t_1, \xi(t_1), t_2, dy_2). \tag{31}$$

The equality obtained has already been encountered in Section 4 of Chapter II as the Chapman-Kolmogorov equations.

Definition 5. A random process $\xi(t)$ ($t \in T$) with values in $\mathcal{Y}$ is called Markov (or Markovian) if

a) for any $t_1 < t_2 < \ldots < t_n < t$, $t_k \in T$ ($k = 1, \ldots, n$), $t \in T$ equality (29) is satisfied.

b) There exists a function $P(s, y, t, B)$, $\mathfrak{B}$-measurable with respect to $y$ for fixed $s, t, B$, which is, for fixed $s$, $y$ and $t$, a probability measure on $\mathfrak{B}$ satisfying the Chapman-Kolmogorov equation

$$P(t_1, y, t_3, B) = \int P(t_2, y_2, t_3, B)\, P(t_1, y, t_2, dy_2) \tag{32}$$

and which coincides with probability 1 with the conditional probabilities $P(s, \xi(s), t, A) = P\{\xi(t) \in A \mid \xi(s)\}$.

Functions $P(t, y, s, B)$ are called transition probabilities of a Markov process.

Therefore, it follows from the definition that the family of conditional probabilities (29) is regular and process $\xi(t)$ does not depend on the "past". The property of a process expressed by equality (29) is called the Markov property or the absence of after-effect.
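For a finite phase space the Chapman-Kolmogorov equation (32) reduces to matrix multiplication. The sketch below assumes a hypothetical time-homogeneous chain with three states observed at integer times, so that the transition probability from time $s$ to time $t$ is the $(t - s)$-th power of a one-step matrix `P1`; all names and numerical values are illustrative.

```python
def matmul(p, r):
    """Product of two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * r[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def transition(step_matrix, s, t):
    """Transition matrix from time s to time t: the (t - s)-th power of the one-step matrix."""
    n = len(step_matrix)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t - s):
        result = matmul(result, step_matrix)
    return result


# Hypothetical one-step transition matrix; every row sums to 1.
P1 = [[0.9, 0.1, 0.0], [0.2, 0.5, 0.3], [0.0, 0.4, 0.6]]
lhs = transition(P1, 0, 5)                                   # P(t1, y, t3, B)
rhs = matmul(transition(P1, 0, 2), transition(P1, 2, 5))     # integral over the state at t2
```

Here `lhs` and `rhs` agree entry by entry, the sum over the intermediate state playing the role of the integral with respect to $P(t_1, y, t_2, dy_2)$.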

We now show that certain stronger assertions can be deduced from the Markov property. Applying again formula (19) of Section 3, Chapter I and equality (30), we obtain that for $t_1 < t_2 < \ldots < t_m < \ldots < t_{m+n}$, $t_k \in T$ ($k = 1, \ldots, n + m$)

$$E\{g(\xi(t_{m+1}), \ldots, \xi(t_{m+n})) \mid \xi(t_1), \ldots, \xi(t_m)\} = \int P(t_m, \xi(t_m), t_{m+1}, dy_1) \times \cdots \times \int P(t_{m+n-1}, y_{n-1}, t_{m+n}, dy_n)\, g(y_1, \ldots, y_n) = E\{g(\xi(t_{m+1}), \ldots, \xi(t_{m+n})) \mid \xi(t_m)\} \pmod P.$$

If we set $g(y_1, \ldots, y_n) = \chi_{B^{(n)}}(y_1, \ldots, y_n)$, where $B^{(n)}$ is a Borel set in $\mathcal{Y}^n$, then the equality which generalizes the Markov property of a process:

$$P\{[\xi(t_{m+1}), \ldots, \xi(t_{m+n})] \in B^{(n)} \mid \xi(t_1), \ldots, \xi(t_m)\} = P\{[\xi(t_{m+1}), \ldots, \xi(t_{m+n})] \in B^{(n)} \mid \xi(t_m)\} \pmod P$$

will follow for any $t_1 < t_2 < \ldots < t_{m+n}$ ($\in T$) and any $n$ and $m$. Denote by $\mathfrak{F}_t$ the $\sigma$-algebra of events generated by the random variables $\xi(s)$, $s \in T$, $s \le t$, and by $\mathfrak{F}^t$ the $\sigma$-algebra generated by the variables $\xi(s)$, $s \in T$, $s > t$. We then have for any cylindrical set $C \in \mathfrak{F}^t$ with $t_1 < t_2 < \ldots < t_n \le t$

$$P\{C \mid \xi(t_1), \ldots, \xi(t_n)\} = P\{C \mid \xi(t_n)\} \pmod P. \tag{33}$$

Let $\Lambda$ be the class of events for which (33) holds. In view of the properties of conditional probabilities (Section 3, Chapter I), $\Lambda$ is a $\lambda$-class and contains the $\pi$-class of cylindrical events. Therefore $\Lambda \supset \mathfrak{F}^t$. On the other hand,
let $\mathfrak{N}$ be the class of events $N$ for which for any $S \in \mathfrak{F}^t$

$$\int_N P(S \mid \mathfrak{F}_t)\, dP = \int_N P(S \mid \xi(t))\, dP. \tag{34}$$

In view of (33) all the cylinders from $\mathfrak{F}_t$ are included in $\mathfrak{N}$. Since the r.h.s. and the l.h.s. of equality (34) are countably-additive functions on $\mathfrak{F}_t$, the fact that they coincide on the cylindrical sets of $\mathfrak{F}_t$ yields that they are identical on $\mathfrak{F}_t$. We thus have the following

Theorem 6. For an arbitrary $S \in \mathfrak{F}^t$

$$P(S \mid \mathfrak{F}_t) = P(S \mid \xi(t)) \pmod P. \tag{35}$$

Relation (35) shows that the conditional probability of an arbitrary event $S$ determined by the behavior of a Markov process in the "future", given that the "past" is completely specified, depends only on the "present".

A family consisting of a probability measure $\mu_0$ on $\{\mathcal{Y}, \mathfrak{B}\}$ and of transition probabilities $P(t, y, s, B)$ ($t < s$; $t, s \in T$; $y \in \mathcal{Y}$; $B \in \mathfrak{B}$), satisfying conditions b) of Definition 5 is called a wide-sense Markov process defined on $T = [0, b]$ or $T = [0, \infty)$.

The measure $\mu_0$ is called the initial distribution of the system.

For an arbitrary bounded Borel function $f(y_1, \ldots, y_n)$ of $n$ variables $y_k \in \mathcal{Y}$ and for arbitrary $t_k \in T$ ($k = 1, \ldots, n$, $0 < t_1 < \ldots < t_n$) we set

$$I_{t_1, \ldots, t_n}[f] = \int \mu_0(dy_0) \int P(0, y_0, t_1, dy_1) \cdots \int f(y_1, y_2, \ldots, y_n)\, P(t_{n-1}, y_{n-1}, t_n, dy_n) \tag{36}$$

and

$$P_{t_1, \ldots, t_n}(B^{(n)}) = I_{t_1, \ldots, t_n}[\chi_{B^{(n)}}], \tag{37}$$

where $\chi_{B^{(n)}}(\cdot)$ is the indicator of the set $B^{(n)} \in \mathfrak{B}^n$, and $\mathfrak{B}^n$ is the $\sigma$-algebra of Borel sets in $\mathcal{Y}^n$. Note that for an arbitrary Borel function $f(y_1, \ldots, y_n)$ the function

$$f_1(y_1, y_2, \ldots, y_{n-1}) = \int f(y_1, y_2, \ldots, y_n)\, P(t, y_{n-1}, s, dy_n) \qquad (t < s)$$

is also Borel, since the integral is represented as the limit of integrals of simple functions, and the latter are Borel functions of the variables $y_1, y_2, \ldots, y_{n-1}$. In view of the properties of the integral, $P_{t_1, \ldots, t_n}(B^{(n)})$ is a measure on $\mathfrak{B}^n$. Clearly the family of measures $P_{t_1, \ldots, t_n}(B^{(n)})$ satisfies the compatibility conditions and in view of Kolmogorov's theorem (Theorem 2 of Section 4, Chapter I) - in the case when $\mathcal{Y}$ is a complete separable
metric space - admits a certain representation $\{\Omega, \mathfrak{S}, P\}$, where $\Omega$ is the space of all functions $\omega(t)$, $t \in T$ with values in $\mathcal{Y}$. Let $\xi(t)$ be an arbitrary process stochastically equivalent to $\{\Omega, \mathfrak{S}, P\}$. We check that

$$P\{\xi(t) \in B \mid \xi(t_1), \ldots, \xi(t_n)\} = P(t_n, \xi(t_n), t, B) \pmod P,$$

i.e. $\xi(t)$ is a Markov process with given transition probabilities. For this purpose it is sufficient to verify equality

$$\int_{B^{(n)}} P(t_n, y_n, t, B)\, P_{t_1, \ldots, t_n}(dy_1, dy_2, \ldots, dy_n) = P_{t_1, \ldots, t_n, t}(B^{(n)} \times B)$$

for arbitrary $B \in \mathfrak{B}$, $B^{(n)} \in \mathfrak{B}^n$ and $t_1 < t_2 < \ldots < t_n < t$. But this equality follows directly from formulas (36), (37) and Theorem 2 of Section 4 Chapter II.

Therefore if $\mathcal{Y}$ is a complete separable metric space then a certain representation exists for an arbitrary wide-sense Markov process.
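For a finite phase space the construction (36), (37) and the compatibility of the resulting finite-dimensional distributions can be checked directly. The sketch below assumes a hypothetical three-state chain with initial distribution `mu0` and a one-step transition matrix `P` used between consecutive integer times; the function names are illustrative.

```python
from itertools import product


def joint(mu0, P, n):
    """Joint law of the states at times 1..n, built as in (36): start from mu0 and chain transitions."""
    states = range(len(mu0))
    law = {}
    for path in product(states, repeat=n):
        total = 0.0
        for y0 in states:
            prob = mu0[y0]
            prev = y0
            for y in path:
                prob *= P[prev][y]
                prev = y
            total += prob
        law[path] = total
    return law


def drop_last(law):
    """Marginal law obtained by summing out the last coordinate."""
    out = {}
    for path, p in law.items():
        out[path[:-1]] = out.get(path[:-1], 0.0) + p
    return out


mu0 = [0.5, 0.3, 0.2]
P = [[0.9, 0.1, 0.0], [0.2, 0.5, 0.3], [0.0, 0.4, 0.6]]
law3 = joint(mu0, P, 3)
law2 = joint(mu0, P, 2)
```

Summing `law3` over the last coordinate reproduces `law2`, which is the compatibility condition needed to apply Kolmogorov's theorem; it holds because each row of `P` sums to 1.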



§2. Separable Random Functions 

The basic theorem. Let a random function $\zeta(x) = g(x, \omega)$ be defined on the probability space $\{\Omega, \mathfrak{S}, P\}$, where $x \in \mathcal{X}$ and the values of the function are in a certain measurable space $\{\mathcal{Y}, \mathfrak{B}\}$. We shall assume that $\{\Omega, \mathfrak{S}, P\}$ is a complete probability space.

In many problems events of the form

$$\{\omega : \zeta(x) \in F \text{ for all } x \in G\} \tag{1}$$

play a significant role.

Unfortunately, if $G$ is uncountable one cannot in general assert that event (1) is $\mathfrak{S}$-measurable. Nevertheless, it is often required to consider random functions for which this event is measurable for a wide class of sets $F$ and $G$.

The feasibility of overcoming the difficulties connected with the uncountability of the set $G$ is based on the following remark. Assume that there exists in $\mathcal{X}$ a countable set of points $I$ and an $\omega$-set $N$ such that $P\{N\} = 0$ and the symmetric difference of set (1) and set

$$\{\omega : \zeta(x) \in F \text{ for all } x \in G \cap I\} = \bigcap_{x \in G \cap I} \{\omega : \zeta(x) \in F\} \tag{2}$$

is contained in $N$ for all $G \in \mathfrak{G}$ and $F \in \mathfrak{F}$. Then set (1) is measurable. Random functions which satisfy the formulated assumption are called separable (relative to classes $\mathfrak{G}$ and $\mathfrak{F}$). Intuitively it is clear that in order that a random function be separable the sets in $\mathfrak{G}$ should contain a sufficiently large number of points of $I$ so that it would be plausible to regard sets (1) and (2) as insignificantly different from one another.

For example if $\mathcal{X}$ and $\mathcal{Y}$ are metric spaces, $\mathcal{X}$ is a separable space, $\mathfrak{G}$ is the class of open sets in $\mathcal{X}$, $\mathfrak{F}$ is the class of closed sets in $\mathcal{Y}$ and the function $\zeta(x) = g(x, \omega)$ is continuous for almost all $\omega$, then an arbitrary countable everywhere dense set in $\mathcal{X}$ may be chosen for $I$. For this choice sets (1) and (2) coincide for each $\omega$ for which $\zeta(x)$ is continuous.

In the present section it is assumed that $\mathcal{X}$ and $\mathcal{Y}$ are metric spaces with distances $r(x_1, x_2)$ and $\rho(y_1, y_2)$ correspondingly, $\mathcal{X}$ is a separable space and the separability property of a random function is considered relative to the classes $\mathfrak{G}$ and $\mathfrak{F}$ of open sets in $\mathcal{X}$ and closed sets in $\mathcal{Y}$.

Definition 1. A random function $\zeta(x) = g(x, \omega)$ is called separable if there exists in $\mathcal{X}$ an everywhere dense countable set $I$ of points $\{x_j\}$, $j = 1, 2, \ldots$ and in $\Omega$ a set $N$ of probability 0 such that for an arbitrary open set $G \subset \mathcal{X}$ and an arbitrary closed set $F \subset \mathcal{Y}$ the two sets

$$\{\omega : g(x, \omega) \in F \text{ for all } x \in G\},$$

$$\{\omega : g(x, \omega) \in F \text{ for all } x \in G \cap I\}$$

differ from each other only on a subset of $N$.

The countable set $I$ of points $x_j$ which appears in this definition is called the separability set of a random function. It turns out that the separability property is not a stringent restriction imposed on a random function. Under sufficiently broad assumptions pertaining only to the nature of the domain of definition $\mathcal{X}$ and the region of values $\mathcal{Y}$ of a random function there exists a separable random function which is stochastically equivalent to that given. It should, however, be noted that when constructing the equivalent separable random function it may sometimes be necessary to extend the range of values of the function so that it will become a compact set.
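A standard illustration of why separability is needed, and why passing to an equivalent function costs nothing, is the function equal to 1 at a single random point and 0 elsewhere. The sketch below is a hypothetical example, not taken from the text: $\omega$ is thought of as uniformly distributed on $[0, 1]$, $I$ is the set of dyadic rationals (only a finite level is enumerated), $G = (0, 1)$ and $F = \{0\}$.

```python
from fractions import Fraction


def g(x, omega):
    """Hypothetical random function: 1 at the single point x = omega and 0 elsewhere."""
    return 1 if x == omega else 0


def g_separable(x, omega):
    """Identically zero modification; for fixed x it differs from g only when omega == x."""
    return 0


def dyadic_points(level):
    """Dyadic rationals k / 2**level in [0, 1], a finite piece of a countable dense set I."""
    return [Fraction(k, 2 ** level) for k in range(2 ** level + 1)]


omega = Fraction(1, 3)  # not a dyadic rational
on_dense_set = all(g(x, omega) == 0 for x in dyadic_points(12))
on_whole_interval = g(omega, omega) == 0
```

For this $\omega$ the condition "$g \in F$ on $G \cap I$" holds while "$g \in F$ on $G$" fails, and the same happens for every non-dyadic $\omega$, that is, with probability 1; so `g` is not separable with respect to the dyadic points. The modification `g_separable` is separable and satisfies $P\{g(x, \omega) \ne \tilde g(x, \omega)\} = P\{\omega = x\} = 0$ for each fixed $x$.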

We first present a criterion of separability of a random function. Let $\mathcal{Y}$ be compact, $g(x, \omega)$ be a separable random function with values in $\mathcal{Y}$, $I$ be the separability set, $N$ be the corresponding exceptional set of points $\omega$.

Denote by $V$ the class of all open spheres of the space $\mathcal{X}$ with rational radii and center at the points of a fixed countable everywhere dense set in $\mathcal{X}$. The class $V$ is countable. On the other hand an arbitrary open set $G$ in $\mathcal{X}$ can be represented as a sum (of a countable number) of spheres in $V$.

Let $A(G, \omega)$ be the closure of the set of values of the function $g(x, \omega)$ where $x$ runs through the set $I \cap G$ and

$$A(x, \omega) = \bigcap_{S \ni x} A(S, \omega)$$

is the intersection of all $A(S, \omega)$ when $S$ runs through the collection of spheres in $V$ each containing point $x$. The family of closed sets $A(S, \omega)$ ($x \in S$) is "centered", i.e. an arbitrary finite number of sets of this family has common points and in view of the compactness of $\mathcal{Y}$ their intersection is non-void. It follows from the separability of function $g(x, \omega)$ that

$$g(x, \omega) \in A(x, \omega), \qquad \omega \notin N. \tag{3}$$

Conversely if (3) is satisfied for each $\omega \notin N$ with $P\{N\} = 0$, then $g(x, \omega)$ is a separable random function. Indeed, if $g(x, \omega) \in F$ for all $x \in I \cap S$, where $F$ is a closed set in $\mathcal{Y}$ and $S \in V$, then $A(x, \omega) \subset A(S, \omega) \subset F$ for each $x \in S$ and consequently $g(x, \omega) \in F$ for all $x \in S$.

Let $G$ be an arbitrary open set in $\mathcal{X}$. We represent it as the sum $G = \bigcup_k S_k$ of sets in $V$. In view of the remark just made, it follows from relation

$$g(x, \omega) \in F \text{ for all } x \in I \cap G, \quad \omega \notin N,$$

that

$$g(x, \omega) \in F \text{ for any } x \in G.$$

We state the result obtained as follows:

Lemma 1. In order that a random function $g(x, \omega)$ with values in a compact space $\mathcal{Y}$ be separable it is necessary and sufficient that there exist a set $N$ with $P\{N\} = 0$ such that for $\omega \notin N$ inclusion (3) be satisfied.

Thus to construct a separable stochastically equivalent function for $g(x, \omega)$ it is sufficient to find a function $\tilde g(x, \omega)$ satisfying (3) which coincides with probability 1 with the function $g(x, \omega)$:

$$P\{\tilde g(x, \omega) \ne g(x, \omega)\} = 0.$$

Lemma 2. Let $B$ be an arbitrary Borel set in $\mathcal{Y}$, where $\mathcal{Y}$ is compact. There exists a finite or countable sequence of points $x_1, x_2, \ldots$ such that the set

$$N(x, B) = \{\omega : g(x_k, \omega) \in B,\ k = 1, 2, \ldots,\ g(x, \omega) \notin B\}$$

has probability 0 for any $x \in \mathcal{X}$.

Proof. Let $x_1$ be arbitrary. If $x_1, x_2, \ldots, x_k$ are already constructed, we put

$$m_k = \sup_{x \in \mathcal{X}} P\{g(x_1, \omega) \in B, \ldots, g(x_k, \omega) \in B,\ g(x, \omega) \notin B\}.$$

If $m_k = 0$, then the corresponding sequence is already constructed. If $m_k > 0$, let $x_{k+1}$ be a point such that

$$P\{g(x_1, \omega) \in B, \ldots, g(x_k, \omega) \in B,\ g(x_{k+1}, \omega) \notin B\} \ge \frac{m_k}{2}.$$

Since the sets

$$L_k = \{\omega : g(x_i, \omega) \in B,\ i = 1, 2, \ldots, k,\ g(x_{k+1}, \omega) \notin B\}$$

are disjoint,

$$\frac12 \sum_{k=1}^{\infty} m_k \le \sum_{k=1}^{\infty} P\{L_k\} \le 1.$$

Consequently, $m_k \to 0$ for $k \to \infty$. Therefore

$$P\{g(x_k, \omega) \in B,\ k = 1, 2, \ldots,\ g(x, \omega) \notin B\} \le \lim_{k \to \infty} m_k = 0$$

for any $x$, which proves Lemma 2. $\Box$

The following assertion can be easily deduced from the above : 



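Lemma 3. Let $\mathfrak{M}_0$ be a countable class of sets, and $\mathfrak{M}$ a class consisting of intersections of all possible sequences of sets in $\mathfrak{M}_0$. There exists a finite or countable sequence of points $x_1, x_2, \ldots, x_n, \ldots$ and a set $N(x)$, for each $x$, such that

$$P\{N(x)\} = 0$$

and

$$\{\omega : g(x_n, \omega) \in B,\ n = 1, 2, \ldots,\ g(x, \omega) \notin B\} \subset N(x)$$

for any $B \in \mathfrak{M}$.

To prove the lemma we proceed as follows: Let $I$ be a countable set of points in $\mathcal{X}$ which is a sum of sequences $\{x_n, n = 1, 2, \ldots\}$ constructed for each $B \in \mathfrak{M}_0$ as indicated in Lemma 2 and let $N(x) = \bigcup_{B \in \mathfrak{M}_0} N(x, B)$.

If $B' \in \mathfrak{M}$ and $B \supset B'$, $B \in \mathfrak{M}_0$, then

$$\{\omega : g(x_n, \omega) \in B',\ x_n \in I,\ g(x, \omega) \notin B\} \subset \{\omega : g(x_n, \omega) \in B,\ x_n \in I,\ g(x, \omega) \notin B\} \subset N(x, B) \subset N(x).$$

Moreover, if $B' = \bigcap_{k=1}^{\infty} B_k \in \mathfrak{M}$ with $B_k \in \mathfrak{M}_0$, then

$$\{\omega : g(x_n, \omega) \in B',\ x_n \in I,\ g(x, \omega) \notin B'\} \subset \bigcup_{k=1}^{\infty} \{\omega : g(x_n, \omega) \in B',\ x_n \in I,\ g(x, \omega) \notin B_k\} \subset \bigcup_{k=1}^{\infty} N(x, B_k) \subset N(x),$$

which proves the lemma. $\Box$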

It will now be easy to prove the following theorem : 
Theorem 1. (J. L. Doob) Let $\mathcal{X}$ and $\mathcal{Y}$ be metric spaces, $\mathcal{X}$ be separable, $\mathcal{Y}$ be compact. An arbitrary random function $g(x, \omega)$, $x \in \mathcal{X}$ with values in $\mathcal{Y}$ is stochastically equivalent to a certain separable random function.

Proof. We fix a certain countable everywhere dense set of points $L$ in $\mathcal{Y}$ and let $\mathfrak{M}_0$ be the class of sets which are complements of the spheres of rational radii with centers at points of $L$. Then $\mathfrak{M}$, being the class of all intersections of the sets in $\mathfrak{M}_0$, contains all the closed sets of the space $\mathcal{Y}$. Next for each $S \in V$ we consider the random function $g(x, \omega)$ as defined only for $x \in S$ and construct a sequence $I = I(S)$ and the sets $N(x) = N_S(x)$ as indicated in Lemma 3. Let

$$I = \bigcup_{S \in V} I(S), \qquad N_x = \bigcup_{S \in V} N_S(x).$$

Put

$$\tilde g(x, \omega) = g(x, \omega),$$

if $x \in I$ or $\omega \notin N_x$; if, however, $\omega \in N_x$, $x \notin I$, then we define $\tilde g(x, \omega)$ in an arbitrary manner provided only that $\tilde g(x, \omega) \in A(x, \omega)$. Since for the points $x \in I$ the values of the functions $g(x, \omega)$ and $\tilde g(x, \omega)$ coincide, the sets $A(x, \omega)$ constructed for the functions $g(x, \omega)$ and $\tilde g(x, \omega)$ also coincide. It follows from the definition of $\tilde g(x, \omega)$ that

$$\tilde g(x, \omega) \in A(x, \omega)$$

for arbitrary $x$ and $\omega$. Since $\{\omega : \tilde g(x, \omega) \ne g(x, \omega)\} \subset N_x$, $P\{\tilde g(x, \omega) = g(x, \omega)\} = 1$, which completes the proof of the theorem. $\Box$

Theorem 1 can be directly generalized to the case of random functions 
with values in separable locally compact spaces. 

Theorem 2. Let $\mathcal{Y}$ be a separable locally-compact space and $\mathcal{X}$ be an arbitrary separable metric space. For an arbitrary random function $g(x, \omega)$ defined on $\mathcal{X}$ with values in $\mathcal{Y}$, there exists a stochastically equivalent separable random function $\tilde g(x, \omega)$ taking on values in a certain compact extension $\tilde{\mathcal{Y}}$ of the space $\mathcal{Y}$.

The proof follows from the fact that every locally compact separable space $\mathcal{Y}$ can be considered as a subset of a certain compact $\tilde{\mathcal{Y}}$. For example if $g(x, \omega)$ is a random function with values in a finite-dimensional space $\mathcal{Y}$, then by adjoining to $\mathcal{Y}$ a single point "at infinity" $\infty$, it is easy to obtain a new compact space $\tilde{\mathcal{Y}} = \mathcal{Y} \cup \{\infty\}$ with a new metric such that every closed set $F \subset \mathcal{Y}$ (relative to the topology of space $\mathcal{Y}$) will also be closed in $\tilde{\mathcal{Y}}$ (with respect to the new metric). When constructing a separable realization of a random function, it may be necessary to assign to this function the additional value $\infty$, but clearly for a fixed $x$ the probability of this is zero. $\Box$
Stochastic continuity. In many problems it is important to know which set $I$ may serve as a separability set. Before answering this question we introduce one important notion and present related simple theorems.

Definition 2. A random function $g(x, \omega)$ with values in $\mathcal{Y}$ is called stochastically continuous at point $x_0$, $x_0 \in \mathcal{X}$, if for any $\varepsilon > 0$

$$P\{\rho(g(x_0, \omega), g(x, \omega)) > \varepsilon\} \to 0 \quad \text{as } r(x, x_0) \to 0. \tag{4}$$

If $g(x, \omega)$ is stochastically continuous in every point of a certain set $B \subset \mathcal{X}$ then it is called stochastically continuous on $B$.

Note that the condition of stochastic continuity is a condition imposed on the "two-dimensional" distributions of a random function, i.e. on the joint distribution of the random elements $g(x_1, \omega)$ and $g(x_2, \omega)$, $x_1, x_2 \in \mathcal{X}$. In particular this notion is applicable to wide-sense random functions.

The requirement of stochastic continuity at point $x_0$ means that $\zeta(x) = g(x, \omega)$ converges in probability to $\zeta(x_0)$ as $x \to x_0$.
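Stochastic continuity does not require continuous sample functions. As a minimal sketch, consider a Poisson process $N(x)$ on the line with a hypothetical rate (the value `rate=2.0` is an illustrative assumption): for $0 < \varepsilon < 1$ the probability in (4) is the probability of at least one jump between $x_0$ and $x$, which tends to 0 although every sample function has jumps.

```python
import math


def prob_increment_exceeds(h, rate=2.0):
    """P{|N(x0 + h) - N(x0)| > eps} for a Poisson process and 0 < eps < 1: at least one jump within |h|."""
    return 1.0 - math.exp(-rate * abs(h))


# The probabilities decrease to 0 as the distance h between the two points shrinks.
values = [prob_increment_exceeds(10.0 ** (-k)) for k in range(7)]
```

The list `values` is decreasing and its last entries are of order `rate * h`, in agreement with condition (4).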

Definition 3. If there exists a point $y \in \mathcal{Y}$ such that for $N \to \infty$

$$\sup_{x \in B} P\{\rho(g(x, \omega), y) > N\} \to 0, \tag{5}$$

then the random function $g(x, \omega)$ is called stochastically bounded in $B$.

Theorem 3. A random function $g(x, \omega)$ which is stochastically continuous on a compact set $\mathcal{X}$ is also stochastically bounded on $\mathcal{X}$.

Proof. Let $\varepsilon > 0$ be an arbitrary number given in advance. For each point $x$ we construct a sphere $S_x$ with the center at $x$, such that

$$P\{\rho(g(x, \omega), g(x', \omega)) > 1\} < \frac{\varepsilon}{2}$$

for any point $x' \in S_x$. From the totality of the spheres $S_x$ we select a sequence $S_{x_1}, S_{x_2}, \ldots, S_{x_n}$ which forms a finite cover of $\mathcal{X}$. Then for any $y$

$$\rho(g(x, \omega), y) \le \rho(g(x_1, \omega), y) + \max_{i = 2, \ldots, n} \rho(g(x_1, \omega), g(x_i, \omega)) + \rho(g(x_j, \omega), g(x, \omega)),$$

where $x_j$ denotes the center of one of those spheres $S_{x_k}$ ($k = 1, 2, \ldots, n$) in the interior of which the point $x$ is located. The summands in the r.h.s. of the inequality are finite random variables. Therefore for $N$ sufficiently large

$$P\Big\{\rho(g(x_1, \omega), y) + \max_{i = 2, \ldots, n} \rho(g(x_1, \omega), g(x_i, \omega)) > N\Big\} < \frac{\varepsilon}{2}.$$

If we assume that $N > 1$, then for any $x$

$$P\{\rho(g(x, \omega), y) > 2N\} \le P\{\rho(g(x_j, \omega), g(x, \omega)) > 1\} + P\Big\{\rho(g(x_1, \omega), y) + \max_{i = 2, \ldots, n} \rho(g(x_1, \omega), g(x_i, \omega)) > N\Big\} < \varepsilon;$$

from here it follows that

$$\sup_{x \in \mathcal{X}} P\{\rho(g(x, \omega), y) > 2N\} < \varepsilon. \qquad \Box$$

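Definition 4. A random function $g(x, \omega)$ is called uniformly stochastically continuous on $\mathcal{X}$ if for arbitrary positive $\varepsilon$ and $\varepsilon_1$ as small as desired, a $\delta > 0$ can be found such that

$$P\{\rho(g(x, \omega), g(x', \omega)) > \varepsilon\} < \varepsilon_1, \tag{6}$$

as long as $r(x, x') < \delta$.

Theorem 4. If $g(x, \omega)$ is stochastically continuous on the compact set $\mathcal{X}$ then $g(x, \omega)$ is uniformly stochastically continuous.

Indeed, suppose the assertion is not true; then one can find a pair of positive numbers $\varepsilon$ and $\varepsilon_1$ and for any $\delta_n > 0$ a pair of points $x_n$ and $x_n'$ such that $r(x_n, x_n') < \delta_n$ and

$$P\{\rho(g(x_n, \omega), g(x_n', \omega)) > \varepsilon\} > \varepsilon_1.$$

It may be assumed that $\delta_n \to 0$ and $x_n \to x_0$, then $x_n' \to x_0$ and

$$\varepsilon_1 < P\{\rho(g(x_n, \omega), g(x_n', \omega)) > \varepsilon\} \le P\Big\{\rho(g(x_n, \omega), g(x_0, \omega)) > \frac{\varepsilon}{2}\Big\} + P\Big\{\rho(g(x_0, \omega), g(x_n', \omega)) > \frac{\varepsilon}{2}\Big\}.$$

But this inequality contradicts the condition of stochastic continuity.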

Theorem 5. Let $\mathcal{X}$ be a separable space, $\mathcal{Y}$ an arbitrary
metric space and $g(x, \omega)$ a separable stochastically continuous random
function with values in $\mathcal{Y}$. Then any countable everywhere dense set
of points in $\mathcal{X}$ may serve as a set of separability of the random
function $g(x, \omega)$.

Proof. Let $V = \{S\}$ be a countable set of spheres in $\mathcal{X}$ introduced
above, $I = \{x_k, k = 1, 2, \ldots\}$ be the set of separability of the random
function $g(x, \omega)$, $N$ be the exceptional set of values $\omega$ appearing
in the definition of separability and $J$ be an arbitrary everywhere dense set
of points in $\mathcal{X}$. Let $B(S, \omega)$ denote the closure of the set of
values $g(x', \omega)$ as the point $x'$ runs through the set $J \cap S$, and
$N(S, k)$ be the event that $g(x_k, \omega) \notin B(S, \omega)$ provided
$x_k \in S$. The events $N(S, k)$ have probability 0. Indeed, let $x'_r$,
$r = 1, 2, \ldots$, be an arbitrary sequence of points in $J \cap S$ converging
to $x_k$. Then

$$P\{g(x_k, \omega) \notin B(S, \omega)\}
  \le P\Bigl\{\varliminf_{r \to \infty} \rho(g(x_k, \omega), g(x'_r, \omega)) > 0\Bigr\}
  \le \lim_{n \to \infty} P\Bigl\{\varliminf_{r \to \infty} \rho(g(x_k, \omega), g(x'_r, \omega)) > \tfrac{1}{n}\Bigr\}
  \le \lim_{n \to \infty} \varliminf_{r \to \infty} P\Bigl\{\rho(g(x_k, \omega), g(x'_r, \omega)) > \tfrac{1}{n}\Bigr\} = 0.$$

Let $N' = \bigcup_{S} \bigcup_{x_k \in S} N(S, k)$; then $P\{N'\} = 0$. If
$\omega \notin N \cup N'$ and $g(x, \omega) \in F$ for all $x \in J \cap G$,
where $G$ is an open set and $F$ is a closed set, then for every $x_k \in G$
and for $S$ such that $x_k \in S \subset G$, we have

$$g(x_k, \omega) \in B(S, \omega) \subset F.$$

It follows from the definition of the set $\{x_k\}$ that $g(x, \omega) \in F$
for all $x \in G$ and $\omega \notin N \cup N'$. Thus the set $J$ satisfies the
condition appearing in the definition of the set of separability of a random
function. □



§3. Measurable Random Functions 

Let $\mathcal{X}$ and $\mathcal{Y}$ denote metric spaces as before with metrics
$r(x_1, x_2)$ and $\rho(y_1, y_2)$ respectively; let $g(x, \omega)$ be a random
function with values in $\mathcal{Y}$ and the domain of definition
$\mathcal{X}$, and let $\omega$ be an elementary event in the probability space
$\{\Omega, \mathfrak{S}, P\}$.

Assume that a $\sigma$-algebra of sets $\mathfrak{A}$ containing the Borel sets
is defined on $\mathcal{X}$ and that a certain complete measure $\mu$ is defined
on $\mathfrak{A}$. Denote by $\sigma\{\mathfrak{A} \times \mathfrak{S}\}$ the
smallest $\sigma$-algebra generated in $\mathcal{X} \times \Omega$ by the product
of the $\sigma$-algebras $\mathfrak{A}$ and $\mathfrak{S}$, and by
$\bar{\sigma}\{\mathfrak{A} \times \mathfrak{S}\}$ its completion relative to the
measure $\mu \times P$.

Definition 1. The random function $g(x, \omega)$ is called measurable if it is
measurable with respect to $\bar{\sigma}\{\mathfrak{A} \times \mathfrak{S}\}$.

Denote by $\mathfrak{B}$ the $\sigma$-algebra of Borel sets of the space
$\mathcal{Y}$. Recall that in the general case it follows from the definition
of a random function that for any $B \in \mathfrak{B}$ and fixed $x$

$$\{\omega : g(x, \omega) \in B\} \in \mathfrak{S}.$$

If, however, the random function $g(x, \omega)$ is measurable, then

$$\{(x, \omega) : g(x, \omega) \in B\} \in \bar{\sigma}\{\mathfrak{A} \times \mathfrak{S}\}.$$

It follows from here and from Fubini's theorem that $g(x, \omega)$, when
considered as a function of $x$, is $\mathfrak{A}$-measurable with probability 1.

Consider the problem of existence, for a given random function, of a
stochastically equivalent measurable and separable function.

Theorem 1. Let $\mathcal{X}$ be a complete separable metric space, $\mathcal{Y}$
a separable and locally compact space and let the measure $\mu$ be
$\sigma$-finite. If for $\mu$-almost all $x$ the random function $g(x, \omega)$
is stochastically continuous, then there exists a measurable separable function
$g^*(x, \omega)$ which is stochastically equivalent to the function
$g(x, \omega)$.

Proof. First assume that $\mathcal{X}$ and $\mathcal{Y}$ are both compact and
$\mu(\mathcal{X}) < \infty$. It follows from Theorem 1 of Section 2 that there
exists a separable random function $\tilde{g}(x, \omega)$ stochastically
equivalent to the function $g(x, \omega)$. Let $I$ be the set of separability
of the function $\tilde{g}(x, \omega)$; $I$ is everywhere dense in
$\mathcal{X}$. Arrange the points of $I$ in a certain sequence
$\{x_1, x_2, \ldots\}$ and set

$$r_n = \min\{r(x_k, x_l);\ k \ne l,\ k, l = 1, \ldots, n\}.$$

For each $n$ we construct a finite cover of the set $\mathcal{X}$ by spheres
$S_j^{(n)}$ of radius $r_n/2$ with centers at the points $x_j^{(n)}$,
$j = 1, \ldots, m_n$. It is assumed here that $x_j^{(n)} = x_j$ for
$j = 1, 2, \ldots, n$ and that the other points $x_j^{(n)}$
($j = n+1, \ldots, m_n$) are chosen arbitrarily from $I$, provided that the
spheres $S_j^{(n)}$ ($j = n+1, \ldots, m_n$), together with the first $n$
spheres, form a cover of the set $\mathcal{X}$. Set
$g_n(x, \omega) = \tilde{g}(x_k, \omega)$ if $x \in S_k^{(n)}$,
$k = 1, 2, \ldots, n$ (these spheres do not intersect, so that the definition
is proper), and $g_n(x, \omega) = \tilde{g}(x_j^{(n)}, \omega)$ if
$x \in S_j^{(n)} \setminus \bigcup_{i=1}^{j-1} S_i^{(n)}$,
$j = n+1, \ldots, m_n$, where $x_j^{(n)}$ is the center of the sphere
$S_j^{(n)}$.

Note that the $g_n(x, \omega)$ are Borel functions of the argument $x$ for a
fixed $\omega$ and are $\sigma\{\mathfrak{A} \times \mathfrak{S}\}$-measurable
as functions of the pair $(x, \omega)$. Moreover,

$$\rho[g_n(x, \omega), \tilde{g}(x, \omega)]
  = \rho[\tilde{g}(x_k^{(n)}, \omega), \tilde{g}(x, \omega)]
  \quad \text{for some } x_k^{(n)} \text{ with } r(x_k^{(n)}, x) < \frac{r_n}{2}. \tag{1}$$

If we let

$$G_{nm}(x) = P\{\omega : \rho[g_n(x, \omega), g_{n+m}(x, \omega)] > \varepsilon\},$$

then in view of the condition of the theorem, $G_{nm}(x) \to 0$ as
$n \to \infty$ for $\mu$-almost all $x$. Therefore

$$(\mu \times P)\{(x, \omega) : \rho[g_n(x, \omega), g_{n+m}(x, \omega)] > \varepsilon\}
  = \int_{\mathcal{X}} G_{nm}(x)\, \mu(dx) \to 0$$

as $n \to \infty$, i.e. the sequence $g_n(x, \omega)$ is fundamental in the
measure $\mu \times P$. A subsequence $g_{n_k}(x, \omega)$ may be extracted from
this sequence which converges $\mu \times P$-almost everywhere to some
$\sigma\{\mathfrak{A} \times \mathfrak{S}\}$-measurable function
$\bar{g}(x, \omega)$.

Denote by $K$ the set of points $(x, \omega)$ on which this convergence does
not take place. Since $K$ has measure 0, $\mu$-almost all its cross sections
$K_x = \{\omega : (x, \omega) \in K\}$ have $P$-measure 0. Denote by $X_1$ the
set of all those $x$ for which this measure is $> 0$. By virtue of the
preceding construction we may assume that $X_1 \cap I = \emptyset$ and
$\bar{g}(x_n, \omega) = \tilde{g}(x_n, \omega)$. Let $X_2$ denote the set of
those $x$ for which the stochastic continuity does not hold. It follows from
(1) that

$$P\{\bar{g}(x, \omega) \ne \tilde{g}(x, \omega)\} = 0, \quad \text{if } x \notin X_1 \cup X_2.$$

We now set $g^*(x, \omega) = \bar{g}(x, \omega)$ if $(x, \omega) \notin K$ and
$x \notin X_1 \cup X_2$, and $g^*(x, \omega) = \tilde{g}(x, \omega)$ if
$(x, \omega) \in K$ or $x \in X_1 \cup X_2$. Then
$P\{g^*(x, \omega) \ne g(x, \omega)\} = 0$ for all $x$, so that
$g^*(x, \omega)$ is stochastically equivalent to $g(x, \omega)$. The function
$g^*(x, \omega)$ is $\bar{\sigma}\{\mathfrak{A} \times \mathfrak{S}\}$-measurable
since it differs from a $\sigma\{\mathfrak{A} \times \mathfrak{S}\}$-measurable
function on a set of $\mu \times P$-measure 0. It remains to show that
$g^*(x, \omega)$ is separable. Let $A(G, \omega)$ denote (as in Section 2) the
closure of the set of values $\tilde{g}(x, \omega)$ obtained as $x$ runs
through the set $G \cap I$, and let $A(x, \omega)$ be the intersection of the
sets $A(S, \omega)$, where $S$ is an arbitrary sphere with the center at the
point $x$. Separability of the function $\tilde{g}(x, \omega)$ is equivalent
to the condition $\tilde{g}(x, \omega) \in A(x, \omega)$. Since
$g^*(x, \omega) = \tilde{g}(x, \omega)$ for $x \in I$, the set
$A^*(x, \omega)$ constructed for $g^*(x, \omega)$ coincides with
$A(x, \omega)$. Next it follows from the definition of $g_n(x, \omega)$ that
$g^*(x, \omega) = \bar{g}(x, \omega) = \lim g_{n_k}(x, \omega) \in A(x, \omega)$
for any $x \notin X_1 \cup X_2$ and $(x, \omega) \notin K$; moreover
$g^*(x, \omega) = \tilde{g}(x, \omega) \in A(x, \omega)$, by definition, for any
$x \in X_1 \cup X_2$ or $(x, \omega) \in K$. Thus $g^*(x, \omega)$ is a
separable random function and the theorem is proved in the particular case
under consideration.

It is now easy to obtain the proof in the general case. The compactness of the
space $\mathcal{Y}$ is needed only in order to be able to refer to Theorem 1 of
Section 2. Here, however, we may refer to Theorem 2 of Section 2. Moreover,
the separable and measurable representation $g^*(x, \omega)$ of the function
$g(x, \omega)$ takes on, in general, values in some compact topological
extension of the space $\mathcal{Y}$. Next, if $\mathcal{X}$ is a complete
separable space and the measure $\mu$ is $\sigma$-finite, then $\mathcal{X}$
can be represented as a sum of a countable number of compacts
$\{X_n, n = 1, 2, \ldots\}$ of finite measure and of a set $N$ of
$\mu$-measure 0. The latter follows from the fact that in a complete separable
metric space every measurable set $A$ of finite measure can be approximated in
measure as closely as desired by a compact $K \subset A$. The foregoing
arguments are applicable to each one of the compacts $X_n$. From here the
general assertion of the theorem easily follows. □

Remark 1. Theorem 1 holds for Euclidean spaces $\mathcal{X}$ and $\mathcal{Y}$,
in particular, if the measure $\mu$ is the Lebesgue measure on $\mathcal{X}$.

Remark 2. The proof of Theorem 1 would have been simpler if the separability
of the measurable representation of the given function were not required. In
such a case it would not have been necessary to consider the set $I$, and the
points $x_k^{(n)}$ could have been chosen arbitrarily from the corresponding
sets. The only property used would have been the completeness of the space
$\mathcal{Y}$. Therefore if $\mathcal{Y}$ is complete, $\mathcal{X}$ is a
complete and separable metric space and $\mu$ is a $\sigma$-finite measure,
then a random function $g(x, \omega)$ with values in $\mathcal{Y}$,
$x \in \mathcal{X}$, $\omega \in \Omega$, which is stochastically continuous
for $\mu$-almost all $x$, is stochastically equivalent to a measurable random
function.

The next important result follows directly from Fubini’s theorem. 

Theorem 2. Let $\xi(x) = g(x, \omega)$ be a measurable random function taking on
real or complex values. If

$$\int_{\mathcal{X}} E|\xi(x)|\, \mu(dx) < \infty,$$

then for any $A \in \mathfrak{A}$

$$\int_A E\xi(x)\, \mu(dx) = E \int_A \xi(x)\, \mu(dx). \quad \Box$$

The last equality indicates the permutability of the operations of taking 
the mathematical expectation of a random variable and the integration 
with respect to x. 



§4. A Criterion for the Absence of Discontinuities of the Second Kind 

Functions with no discontinuities of the second kind. Let $\xi(t)$,
$t \in [a, b]$, be a random process with values in a complete metric space
$\mathcal{Y}$.

Definition 1. If the sample functions of the process have for each
$t \in (a, b)$ with probability 1 left-hand and right-hand limits, and possess
at the point $a$ ($b$) a right (left)-hand limit, then the process is referred
to as without discontinuities of the second kind on the interval $[a, b]$.

In the present section it will always be assumed that the process $\xi(t)$
is separable. The separability set of the process is denoted by $J$.

Definition 2. The function $y = f(t)$, $y \in \mathcal{Y}$, possesses at least
$m$ $\varepsilon$-oscillations ($\varepsilon > 0$) on the interval $[a, b]$ if
there exist points $t_0, \ldots, t_m$, $a \le t_0 < t_1 < \ldots < t_m \le b$,
such that

$$\rho(f(t_{k-1}), f(t_k)) > \varepsilon, \quad k = 1, \ldots, m.$$

Lemma 1. A function $y = f(t)$ has no discontinuities of the second kind on
the interval $[a, b]$ if and only if for any $\varepsilon > 0$ it has only a
finite number of $\varepsilon$-oscillations on $[a, b]$.

Proof. Necessity. Let the number of $\varepsilon$-oscillations be infinite.
Then there exists a sequence $t_1, t_2, \ldots, t_n, \ldots$ such that
$t_n \uparrow t_0$ or $t_n \downarrow t_0$ and
$\rho(f(t_n), f(t_{n+1})) > \varepsilon$. But this implies that $f(t_0 - 0)$
or $f(t_0 + 0)$ does not exist.

Sufficiency. Let the one-sided limit (say the left-hand one) be nonexistent
at a certain point $t_0$. Then a sequence $t_n \uparrow t_0$ and an
$\varepsilon > 0$ can be found such that for any $n$
$\sup_{m > n} \rho(f(t_m), f(t_n)) > \varepsilon$, i.e. the number of
$\varepsilon$-oscillations is infinite. □

Note that Definition 2 is trivially carried over to the case of random
functions defined on an arbitrary set of real-valued $t$.
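For a function observed on a finite set of instants, the largest number of $\varepsilon$-oscillations in the sense of Definition 2 can be computed directly. The following is only an illustrative sketch, assuming a real-valued function (so that $\rho(y_1, y_2) = |y_1 - y_2|$) sampled at increasing instants; the function name `max_oscillations` is hypothetical and does not come from the text.

```python
from typing import Sequence


def max_oscillations(values: Sequence[float], eps: float) -> int:
    """Largest m such that the samples possess at least m eps-oscillations.

    values[i] is f(t_i) on an increasing grid t_0 < t_1 < ...; the metric is
    assumed to be rho(y1, y2) = |y1 - y2| (an assumption of this sketch).
    """
    n = len(values)
    # best[j] = length of the longest chain of eps-oscillations ending at t_j
    best = [0] * n
    for j in range(n):
        for i in range(j):
            if abs(values[j] - values[i]) > eps:
                best[j] = max(best[j], best[i] + 1)
    return max(best, default=0)


print(max_oscillations([0, 1, 0, 1, 0], 0.5))       # alternating samples
print(max_oscillations([0, 1, 2, 3, 4, 5, 6], 2))   # slow monotone drift
```

The quadratic dynamic programme is chosen for clarity: a chain $t_0 < \ldots < t_m$ may skip grid points, so a greedy scan over neighbouring samples would undercount.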

Henceforth when dealing with functions with no discontinuities of the second
kind we shall not distinguish between two functions having at each point
$t \in [a, b]$ the same left-hand and right-hand limits. Therefore it is
natural to choose a certain convention concerning the values of these
functions at the discontinuity points. Denote by
$D[a, b] = D[a, b; \mathcal{Y}]$ the space of functions defined on $[a, b]$
with values in $\mathcal{Y}$ which do not possess discontinuities of the
second kind and which are continuous from the left or from the right at each
point $t \in [a, b]$. Set

$$\Delta_c(f) = \sup\{\min[\rho(f(t'), f(t)), \rho(f(t''), f(t))];\
  t - c \le t' < t < t'' \le t + c,\ t', t, t'' \in [a, b]\}
  + \sup\{\rho(f(t), f(a));\ a < t < a + c\}
  + \sup\{\rho(f(t), f(b));\ b - c < t < b\}. \tag{1}$$

Lemma 2. A function $y = f(t)$ has no discontinuities of the second kind if
and only if

$$\lim_{c \to 0} \Delta_c(f) = 0. \tag{2}$$

Proof. Necessity. It follows from the definition that for any function
$f \in D[a, b]$ the last two terms in the r.h.s. of (1) tend to zero as
$c \to 0$.

Let condition (2) not be satisfied. Then sequences $t'_n$, $t_n$, $t''_n$ can
be found such that $t'_n < t_n < t''_n$, $t''_n - t'_n \to 0$ and
$\rho(f(t'_n), f(t_n)) > \varepsilon$, $\rho(f(t''_n), f(t_n)) > \varepsilon$
for some $\varepsilon > 0$. It may be assumed that $t_n$ converges to some
$t_0$ (if this is not the case we replace the sequence $t_n$ by a certain
convergent subsequence of it). At least two out of the three sequences
$\{t'_n\}$, $\{t_n\}$ and $\{t''_n\}$ possess infinitely many points located
on one side of $t_0$. If, for example, $\{t'_n\}$ and $\{t_n\}$ are located to
the left of $t_0$, then $f(t'_n) \to f(t_0 - 0)$, $f(t_n) \to f(t_0 - 0)$,
which contradicts the condition $\rho(f(t'_n), f(t_n)) > \varepsilon$. The
case for which $\{t_n\}$ and $\{t''_n\}$ possess infinitely many values
located to the right of $t_0$ is dealt with analogously. All other cases may
be reduced to these two.

Sufficiency. It follows from condition (2) that $f(t)$ is continuous from the
right at the point $a$ and from the left at the point $b$. If for some
$t_0 \in [a, b)$, $f(t_0 + 0)$ does not exist, then a sequence
$t_n \downarrow t_0$ and an $\varepsilon > 0$ can be found such that
$\rho(f(t_n), f(t_{n+1})) > \varepsilon$, which contradicts (2). Therefore
$f(t_0 + 0)$ exists for any $t_0 \in [a, b)$. Analogously the existence of
$f(t_0 - 0)$ is obtained. It follows from relation (2) that either
$f(t_0) = f(t_0 - 0)$ or $f(t_0) = f(t_0 + 0)$.

The lemma is thus proved. □
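The quantity $\Delta_c(f)$ of (1) can be evaluated by brute force when the function is known only on a finite grid. The sketch below is an illustration under stated assumptions, not part of the original argument: the function is real-valued with $\rho(y_1, y_2) = |y_1 - y_2|$, the suprema are taken over grid points only, and the name `delta_c` is hypothetical.

```python
from typing import Sequence


def delta_c(times: Sequence[float], values: Sequence[float], c: float) -> float:
    """Grid analogue of Delta_c(f) in (1) for real-valued samples.

    `times` must be increasing; a = times[0], b = times[-1].
    """
    n = len(times)
    a, b = times[0], times[-1]
    interior = 0.0
    for j in range(n):                      # middle point t
        for i in range(j):                  # t' < t with t - c <= t'
            if times[j] - times[i] > c:
                continue
            left = abs(values[i] - values[j])
            for k in range(j + 1, n):       # t'' > t with t'' <= t + c
                if times[k] - times[j] > c:
                    break
                right = abs(values[k] - values[j])
                interior = max(interior, min(left, right))
    # boundary terms: oscillation near a and near b
    at_a = max((abs(v - values[0]) for t, v in zip(times, values)
                if a < t < a + c), default=0.0)
    at_b = max((abs(v - values[-1]) for t, v in zip(times, values)
                if b - c < t < b), default=0.0)
    return interior + at_a + at_b


grid = list(range(11))
step = [0] * 5 + [1] * 6        # one jump: a discontinuity of the first kind
spike = [0] * 5 + [1] + [0] * 5  # an isolated value differing from both sides
print(delta_c(grid, step, 2), delta_c(grid, spike, 2))
```

The single jump contributes nothing because, for every triple $t' < t < t''$, at least one neighbour lies on the same side of the jump as $t$; this is why the minimum appears in (1).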

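Some inequalities. Lemma 3. Let $\xi(t)$, $t \in [0, T]$, be a separable
stochastically continuous process with values in $\mathcal{Y}$ and let there
exist a nonnegative monotonically increasing function $g(h)$ and a function
$q(C, h) \ge 0$, $h > 0$, such that

$$P\{[\rho(\xi(t), \xi(t - h)) > C g(h)] \cap [\rho(\xi(t + h), \xi(t)) > C g(h)]\} \le q(C, h) \tag{3}$$

and

$$G = \sum_{n=0}^{\infty} g(T 2^{-n}) < \infty, \qquad
  Q(C) = \sum_{n=1}^{\infty} 2^n q(C, T 2^{-n}) < \infty. \tag{4}$$

Then for any $N > 0$

$$P\Bigl\{\sup_{t', t'' \in [0, T]} \rho(\xi(t'), \xi(t'')) > N\Bigr\}
  \le P\Bigl\{\rho(\xi(0), \xi(T)) > \frac{N g(T)}{2G}\Bigr\} + Q\Bigl(\frac{N}{2G}\Bigr).$$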

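Proof. Put

$$A_{nk} = \Bigl\{\rho\Bigl(\xi\Bigl(\frac{k+1}{2^n} T\Bigr), \xi\Bigl(\frac{k}{2^n} T\Bigr)\Bigr) \le C g(T 2^{-n})\Bigr\},
  \quad k = 0, 1, 2, \ldots, 2^n - 1, \quad n = 0, 1, 2, \ldots,$$

$$B_{nk} = A_{nk} \cup A_{n, k-1}, \qquad
  D_n = \bigcap_{m=n}^{\infty} \bigcap_{k=1}^{2^m - 1} B_{mk} \quad (n \ge 1), \qquad
  D_0 = A_{00} \cap D_1.$$

In view of stochastic continuity, the separability set $J$ of the process
$\xi(t)$ can be assumed to be the set of numbers of the form $kT/2^n$,
$k = 0, 1, 2, \ldots, 2^n$, $n = 0, 1, 2, \ldots$ (cf. Theorem 5, Section 2).
Since $\bar{B}_{mk} = \bar{A}_{mk} \cap \bar{A}_{m, k-1}$, condition (3) with
$t = kT2^{-m}$ and $h = T2^{-m}$ gives $P(\bar{B}_{mk}) \le q(C, T 2^{-m})$.
We have

$$P(\bar{D}_n) \le \sum_{m=n}^{\infty} \sum_{k=1}^{2^m - 1} P(\bar{B}_{mk})
  \le \sum_{m=n}^{\infty} 2^m q(C, T 2^{-m}) = Q(n, C), \tag{5}$$

where

$$Q(n, C) = \sum_{m=n}^{\infty} 2^m q(C, T 2^{-m}).$$

It follows from $D_0$ that $\rho(\xi(T), \xi(0)) \le C g(T)$ and that one of
the following events takes place: either
$\rho(\xi(T/2), \xi(0)) \le C g(T 2^{-1})$ or
$\rho(\xi(T), \xi(T/2)) \le C g(T 2^{-1})$. In both cases

$$\rho(\xi(0), \xi(T/2)) \le C g(T) + C g(T 2^{-1}),$$

$$\rho(\xi(T/2), \xi(T)) \le C g(T) + C g(T 2^{-1}).$$

We now apply the induction method. Assume that the inequality

$$\rho\Bigl(\xi\Bigl(\frac{k}{2^m} T\Bigr), \xi\Bigl(\frac{j}{2^m} T\Bigr)\Bigr)
  \le C g(T) + 2C \sum_{i=1}^{m} g\Bigl(\frac{T}{2^i}\Bigr) \tag{6}$$

is proved for $m = n$ and for $k, j = 0, 1, \ldots, 2^n$ under the assumption
that $D_0$ is valid. We prove that an analogous inequality holds also for
$m = n + 1$.

Let $k$ and $j$ be odd numbers, $k = 2k_1 + 1$, $j = 2j_1 + 1$. Since it
follows from $B_{n+1, k}$ that at least one of the inequalities

$$\rho\Bigl(\xi\Bigl(\frac{k_1}{2^n} T\Bigr), \xi\Bigl(\frac{2k_1 + 1}{2^{n+1}} T\Bigr)\Bigr) \le C g\Bigl(\frac{T}{2^{n+1}}\Bigr), \qquad
  \rho\Bigl(\xi\Bigl(\frac{2k_1 + 1}{2^{n+1}} T\Bigr), \xi\Bigl(\frac{k_1 + 1}{2^n} T\Bigr)\Bigr) \le C g\Bigl(\frac{T}{2^{n+1}}\Bigr)$$

is satisfied, we obtain that

$$\rho\Bigl(\xi\Bigl(\frac{k}{2^{n+1}} T\Bigr), \xi\Bigl(\frac{k'}{2^n} T\Bigr)\Bigr) \le C g\Bigl(\frac{T}{2^{n+1}}\Bigr),$$

where $k'$ is either equal to $k_1$ or to $k_1 + 1$. Analogously an integer
$j'$ can be found such that

$$\rho\Bigl(\xi\Bigl(\frac{j}{2^{n+1}} T\Bigr), \xi\Bigl(\frac{j'}{2^n} T\Bigr)\Bigr) \le C g\Bigl(\frac{T}{2^{n+1}}\Bigr).$$

Taking the induction assumption into account we obtain

$$\rho\Bigl(\xi\Bigl(\frac{k}{2^{n+1}} T\Bigr), \xi\Bigl(\frac{j}{2^{n+1}} T\Bigr)\Bigr)
  \le C g(T) + 2C \sum_{i=1}^{n+1} g(T 2^{-i}).$$

The case when $k$ or $j$ is even is dealt with analogously. Therefore
inequality (6) is proved for all $m \ge 1$. It follows from the separability
of the process that if the event $D_0$ occurs, then

$$\sup\{\rho(\xi(t'), \xi(t''));\ t', t'' \in [0, T]\} \le 2CG$$

with probability 1. From here it follows, on setting $C = N/(2G)$, that

$$P\Bigl\{\sup_{t', t'' \in [0, T]} \rho(\xi(t'), \xi(t'')) > N\Bigr\}
  \le P(\bar{D}_0) \le P(\bar{A}_{00}) + Q(C).$$

The lemma is thus proved. □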

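Lemma 4. Let the conditions of the previous lemma be satisfied. Then

$$P\Bigl\{\Delta_\varepsilon(\xi) > C\, G\Bigl(\Bigl[\log_2 \frac{T}{2\varepsilon}\Bigr]\Bigr)\Bigr\}
  \le Q\Bigl(\Bigl[\log_2 \frac{T}{2\varepsilon}\Bigr], C\Bigr),$$

where

$$G(n) = \sum_{m=n}^{\infty} g(T 2^{-m}), \qquad
  Q(n, C) = \sum_{m=n}^{\infty} 2^m q(C, T 2^{-m}).$$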



Proof We continue the argument of the previous lemma. Let event D„ 
occur. Using induction we shall prove that for any k and m an integer 
jnm Can bc found such that 




n + m 

X g{T2-^). (8) 

s = n 



Moreover quantity regarded as a function of m (for fixed n and 

k) is monotonically non-decreasing. For m = 0 we choose j„o=0 if 



/ fk\ /k+l\\ { (k-\\ (k\\ 

jj<Q(T2-") and j„, = \ if 



^ Cg{T2 ”). Under the assumption of occurrence of D„, one of these in- 
equalities will necessarily hold. Assume j^m has been chosen. We then 
define if 




^7/tm + 1 

2« + m + 1
and $j_{n, m+1} = 2 j_{nm} + 1$, if



e ^ 



fc-1 j„m 
2" ”^2"+" 



T],i 



'k—l 2j„^ + l 
2« 2” 1 



TJ U 



Such a choice of 7 „^+i is possible since one of these two inequalities 
necessarily holds if occurs; in the case when both of these inequalities 
are satisfied the choice between the above stated values of j„m+i is 
arbitrary. 

Approaching to the limit in (7) and (8) as m oo we obtain that for 
any sample function for which D„ holds a number t = t(o)), 

^ can be found such that 



and 






sup Q 

r<f<r2-("- 1) 
teJ 






d — T 
' 2 " 



^CG(n). 



Let se[2 2 7] and 0<t" — t' <s. Then a k can be found such 

that (/c-l)2"”T^t'<r<(/c+l)2""7: If then either {t\t)e 

e[(/c- 1) 2""T {k- 1) 2-"T+t] or (t, t")cz [(/c- 1) 2 ""T+t, (/c+ 1) 2-"T]. 
If, moreover, the values t\ t, and t" are chosen from J, then at least one 
of the inequalities 



Qiat% m^2CG{n), anH2CG{n) 

is satisfied. 

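It follows from the separability of the process that one of these
inequalities holds with probability 1 for any sample function of the process.
Therefore it follows from $D_n$ that with probability 1

$$\Delta_\varepsilon(\xi) \le 2CG(n).$$

In view of inequality (5)

$$P\{\Delta_\varepsilon(\xi) > 2CG(n)\} \le P(\bar{D}_n) \le Q(n, C),$$

or, taking into account that $\varepsilon \ge 2^{-(n+1)} T$ and the
monotonicity of the functions $g(h)$ and $q(h)$, we finally obtain

$$P\Bigl\{\Delta_\varepsilon(\xi) > C\, G\Bigl(\Bigl[\log_2 \frac{T}{2\varepsilon}\Bigr]\Bigr)\Bigr\}
  \le Q\Bigl(\Bigl[\log_2 \frac{T}{2\varepsilon}\Bigr], C\Bigr).$$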

The lemma is thus proved. □

The conditions based on marginal distributions of the process for absence
of discontinuities of the second kind.

From the preceding lemma the following is immediately obtained.

Theorem 1. If $\xi(t)$, $t \in [0, T]$, is a separable stochastically continuous
process with values in $\mathcal{Y}$ satisfying the conditions

$$P\{[\rho(\xi(t), \xi(t - h)) \ge C g(h)] \cap [\rho(\xi(t + h), \xi(t)) \ge C g(h)]\} \le q(C, h), \tag{9}$$

where

$$\sum_{n=1}^{\infty} g(T 2^{-n}) < \infty, \qquad
  \sum_{n=1}^{\infty} 2^n q(C, T 2^{-n}) < \infty, \tag{10}$$

then with probability 1 $\xi(t)$ has no discontinuities of the second kind. If
moreover

$$Q(n, C) = \sum_{m=n}^{\infty} 2^m q(C, T 2^{-m}) \to 0 \tag{11}$$

for some $n$ and $C \to \infty$, then for each sample function of the process
with probability 1 a constant $\alpha$ can be found such that

$$\Delta_\varepsilon(\xi) \le \alpha\, G\Bigl(\Bigl[\log_2 \frac{T}{2\varepsilon}\Bigr]\Bigr)
  \quad \text{for } 0 < \varepsilon < \varepsilon_0,$$

where

$$G(n) = \sum_{m=n}^{\infty} g(T 2^{-m}).$$

Proof. Setting $C = 1$ in the inequality of Lemma 4 we observe that under
the conditions of the theorem $\Delta_\varepsilon(\xi) \to 0$ in probability as
$\varepsilon \to 0$. But $\Delta_\varepsilon(\xi)$ regarded as a function of
$\varepsilon$ is monotonically decreasing as $\varepsilon \downarrow 0$.
Therefore $\lim \Delta_\varepsilon(\xi)$ as $\varepsilon \to 0$ exists with
probability 1 and equals 0. This proves the first assertion of the theorem.
The second assertion also follows from condition (11) and Lemma 4. □

As a particular case of Theorem 1 consider a separable stochastically
continuous random process satisfying the condition

$$E[\rho(\xi(t + h), \xi(t))\, \rho(\xi(t), \xi(t - h))]^p \le K h^{1 + r}, \tag{12}$$

where $p > 0$ and $r > 0$. Substituting $g(h) = h^{r'/2p}$ and utilizing
Chebyshev's inequality we observe that relations (9), (10) and (11) are
satisfied for $q(C, h) = K C^{-2p} h^{1 + r - r'}$ and $0 < r' < r$. We thus
obtain

Corollary 1. If a separable stochastically continuous random process
satisfies condition (12), then its sample functions satisfy with probability 1
the relation

$$\Delta_\varepsilon(\xi) \le \alpha\, \varepsilon^{r'/2p},$$

where $\alpha = \alpha(\omega)$ is a constant and $r'$ is an arbitrary number
from $(0, r)$.
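The convergence of the two dyadic series required in (10) and (11) can be checked numerically for the power-law choice used before Corollary 1. The sketch below is an illustration only: it assumes $g(h) = h^{r'/2p}$ and $q(C, h) = K C^{-2p} h^{1 + r - r'}$, sums the resulting geometric series in closed form, and compares them with truncated sums. The function names and the default parameter values are hypothetical.

```python
import math


def tail_sums_closed(n, C, T=1.0, p=2.0, r=1.0, r_prime=0.5, K=1.0):
    """Closed forms of G(n) and Q(n, C) for the assumed power-law g and q."""
    a = r_prime / (2.0 * p)   # exponent of g(h) = h**a
    b = r - r_prime           # 2**m * q(C, T 2**-m) decays like 2**(-m*b)
    G_n = T**a * 2.0 ** (-n * a) / (1.0 - 2.0 ** (-a))
    Q_n = K * C ** (-2.0 * p) * T ** (1.0 + b) * 2.0 ** (-n * b) / (1.0 - 2.0 ** (-b))
    return G_n, Q_n


def tail_sums_truncated(n, C, terms=400, T=1.0, p=2.0, r=1.0, r_prime=0.5, K=1.0):
    """The same sums, truncated after `terms` summands."""
    def g(h):
        return h ** (r_prime / (2.0 * p))

    def q(h):
        return K * C ** (-2.0 * p) * h ** (1.0 + r - r_prime)

    G_n = sum(g(T * 2.0 ** (-m)) for m in range(n, n + terms))
    Q_n = sum(2.0 ** m * q(T * 2.0 ** (-m)) for m in range(n, n + terms))
    return G_n, Q_n


G1, Q1 = tail_sums_closed(3, 2.0)
G2, Q2 = tail_sums_truncated(3, 2.0)
print(math.isclose(G1, G2, rel_tol=1e-9), math.isclose(Q1, Q2, rel_tol=1e-9))
```

Both series are geometric, with ratios $2^{-r'/2p}$ and $2^{-(r - r')}$; this is why $r' > 0$ and $r' < r$ are both needed, and why $Q(n, C) \to 0$ as $C \to \infty$ through the factor $C^{-2p}$.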
We also state the following corollary of Theorem 1.

Corollary 2. Let a wide-sense stochastically continuous random process be
defined on $[0, T]$ with values in a complete separable locally compact
space $\mathcal{Y}$ whose "three-dimensional" marginal distributions satisfy
conditions (9) and (10). Then there exists a representation of this process
without discontinuities of the second kind.

The condition based on conditional probabilities for absence of
discontinuities of the second kind.

In the previous theorem the condition for absence of discontinuities of
the second kind was expressed in terms of properties of marginal
("three-dimensional") distributions of a random process.

We now present results of a somewhat different nature. They utilize
assumptions related to conditional probabilities and are applicable
in the case when information on the properties of conditional
distributions of the process is available.

Let $\{\mathfrak{F}_t, t \in [0, T]\}$ be a current of $\sigma$-algebras. We
shall say that the process $\xi(t)$ obeys (or is subordinated to) the current
of $\sigma$-algebras $\{\mathfrak{F}_t, t \in [0, T]\}$ if for each
$t \in [0, T]$ the random element $\xi(t)$ is $\mathfrak{F}_t$-measurable.

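We introduce the quantity

$$\alpha(\varepsilon, \delta) = \inf \sup[P\{\rho(\xi(s), \xi(t)) > \varepsilon \mid \mathfrak{F}_s\};\
  0 \le s \le t \le s + \delta \le T,\ \omega \in \Omega'], \tag{13}$$

where the inf is taken over all the subsets $\Omega'$
($\Omega' \in \mathfrak{S}$) which have probability 1. It is easy to verify
that there exists an $\Omega_0$ such that $P(\Omega_0) = 1$,
$\Omega_0 \in \mathfrak{S}$, and such that the inf is attained on this set, so
that

$$\alpha(\varepsilon, \delta) = \sup\{P\{\rho(\xi(s), \xi(t)) > \varepsilon \mid \mathfrak{F}_s\};\
  0 \le s \le t \le s + \delta \le T,\ \omega \in \Omega_0\}.$$

We show that the condition that $\alpha(\varepsilon, \delta) \to 0$ as
$\delta \to 0$ for any $\varepsilon > 0$ assures the absence of
discontinuities of the second kind for separable processes. Let $[c, d]$ be a
fixed interval, $[c, d] \subset [0, T]$, and let $I$ be an arbitrary finite
sequence of instants of time $t_1, t_2, \ldots, t_n$,
$c \le t_1 < t_2 < \ldots < t_n \le d$. Denote by $A(\varepsilon, I)$ the
event: the sample function of the random process $\xi(t)$ on $[c, d] \cap I$
has at least one $\varepsilon$-oscillation.

Lemma 5. With probability 1

$$P\{A(\varepsilon, I) \mid \mathfrak{F}_c\} \le 2\alpha\Bigl(\frac{\varepsilon}{4}, d - c\Bigr). \tag{14}$$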

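Proof. We first note that since $\mathfrak{F}_s \subset \mathfrak{F}_t$ for
$s < t$, it follows from the properties of conditional mathematical
expectation that

$$P\{\rho(\xi(t), \xi(u)) > \varepsilon \mid \mathfrak{F}_s\}
  = E\{P\{\rho(\xi(t), \xi(u)) > \varepsilon \mid \mathfrak{F}_t\} \mid \mathfrak{F}_s\}
  \le \alpha(\varepsilon, u - s) \tag{15}$$

for $s < t < u$.

We now introduce the events

$$B_k = \Bigl\{\rho(\xi(c), \xi(t_i)) \le \frac{\varepsilon}{2},\ i = 1, 2, \ldots, k - 1;\
  \rho(\xi(c), \xi(t_k)) > \frac{\varepsilon}{2}\Bigr\},$$

$$C_k = \Bigl\{\rho(\xi(t_k), \xi(d)) > \frac{\varepsilon}{4}\Bigr\}, \qquad
  D_k = B_k \cap C_k, \quad k = 1, 2, \ldots, n,$$

$$C_0 = \Bigl\{\rho(\xi(c), \xi(d)) > \frac{\varepsilon}{4}\Bigr\}.$$

The events $D_k$ are disjoint, and if we put $D = \bigcup_{k=1}^{n} D_k$, then
$A(\varepsilon, I) \subset C_0 \cup D$. Indeed, if $A(\varepsilon, I)$ holds,
then for some $k$ the inequality $\rho(\xi(c), \xi(t_k)) > \varepsilon/2$ is
satisfied for the first time, i.e. one of the events $B_k$
($k = 1, \ldots, n$) occurs. If, moreover, $D$ does not hold, i.e. if
$\rho(\xi(t_k), \xi(d)) \le \varepsilon/4$, then

$$\rho(\xi(c), \xi(d)) \ge \rho(\xi(c), \xi(t_k)) - \rho(\xi(t_k), \xi(d)) > \frac{\varepsilon}{4},$$

i.e. the event $C_0$ occurs. Thus $A(\varepsilon, I) \subset C_0 \cup D$. We
now have with probability 1

$$P\{D_k \mid \mathfrak{F}_c\} = E\{\chi_{D_k} \mid \mathfrak{F}_c\}
  = E\{E\{\chi_{B_k} \chi_{C_k} \mid \mathfrak{F}_{t_k}\} \mid \mathfrak{F}_c\}
  = E\{\chi_{B_k} P\{C_k \mid \mathfrak{F}_{t_k}\} \mid \mathfrak{F}_c\}
  \le \alpha\Bigl(\frac{\varepsilon}{4}, d - c\Bigr) E\{\chi_{B_k} \mid \mathfrak{F}_c\},$$

where $\chi_A$ denotes, as usual, the indicator of the event $A$. From here it
follows that

$$P\{D \mid \mathfrak{F}_c\} = \sum_{k=1}^{n} P\{D_k \mid \mathfrak{F}_c\}
  \le \alpha\Bigl(\frac{\varepsilon}{4}, d - c\Bigr) E\Bigl\{\sum_{k=1}^{n} \chi_{B_k} \Bigm| \mathfrak{F}_c\Bigr\}
  \le \alpha\Bigl(\frac{\varepsilon}{4}, d - c\Bigr) \pmod{P}.$$

In view of (15), $P(C_0 \mid \mathfrak{F}_c) \le \alpha(\varepsilon/4, d - c)$.
Therefore

$$P\{A(\varepsilon, I) \mid \mathfrak{F}_c\}
  \le P\{D \mid \mathfrak{F}_c\} + P\{C_0 \mid \mathfrak{F}_c\}
  \le 2\alpha\Bigl(\frac{\varepsilon}{4}, d - c\Bigr) \pmod{P},$$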



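which proves the lemma. □

Lemma 6. Let $A^k(\varepsilon, I)$ denote the event: $\xi(t)$ has at least $k$
$\varepsilon$-oscillations on $I$. Then

$$P\{A^k(\varepsilon, I) \mid \mathfrak{F}_c\}
  \le \Bigl[2\alpha\Bigl(\frac{\varepsilon}{4}, d - c\Bigr)\Bigr]^k \pmod{P}.$$

Proof. Let $B_r(\varepsilon, I)$ denote the event: the sample function of the
process $\xi(t)$ possesses at least $k - 1$ $\varepsilon$-oscillations on the
set $(t_1, \ldots, t_r)$, but the number of $\varepsilon$-oscillations on the
set $(t_1, \ldots, t_{r-1})$ is less than $k - 1$. The events
$B_r(\varepsilon, I)$ ($r = 1, \ldots, n$) are disjoint and
$\bigcup_{r=1}^{n} B_r(\varepsilon, I) = A^{k-1}(\varepsilon, I) \supset A^k(\varepsilon, I)$.
On the other hand, it follows from
$A^k(\varepsilon, I) \cap B_r(\varepsilon, I)$ that at least one
$\varepsilon$-oscillation exists on the set $(t_r, \ldots, t_n)$.
Consequently,

$$A^k(\varepsilon, I) \subset \bigcup_{r=1}^{n} (B_r(\varepsilon, I) \cap C_r(\varepsilon, I)),$$

where $C_r(\varepsilon, I)$ denotes the event that $\xi(t)$ has at least one
$\varepsilon$-oscillation on $(t_r, \ldots, t_n)$. Therefore

$$P\{A^k(\varepsilon, I) \mid \mathfrak{F}_c\}
  \le \sum_{r=1}^{n} P\{B_r(\varepsilon, I) \cap C_r(\varepsilon, I) \mid \mathfrak{F}_c\} \pmod{P}. \tag{16}$$

Using the properties of conditional mathematical expectations we obtain

$$P\{B_r(\varepsilon, I) \cap C_r(\varepsilon, I) \mid \mathfrak{F}_c\}
  = E\{\chi_{B_r(\varepsilon, I)} P\{C_r(\varepsilon, I) \mid \mathfrak{F}_{t_r}\} \mid \mathfrak{F}_c\}
  \le 2\alpha\Bigl(\frac{\varepsilon}{4}, d - c\Bigr) P\{B_r(\varepsilon, I) \mid \mathfrak{F}_c\} \pmod{P}.$$

It follows from the inequality obtained and from (16) that

$$P\{A^k(\varepsilon, I) \mid \mathfrak{F}_c\}
  \le 2\alpha\Bigl(\frac{\varepsilon}{4}, d - c\Bigr) \sum_{r=1}^{n} P\{B_r(\varepsilon, I) \mid \mathfrak{F}_c\}
  = 2\alpha\Bigl(\frac{\varepsilon}{4}, d - c\Bigr) P\{A^{k-1}(\varepsilon, I) \mid \mathfrak{F}_c\} \pmod{P},$$

which yields the required assertion. □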

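Theorem 2. If $\xi(t)$ is a separable process and for any $\varepsilon > 0$

$$\lim_{\delta \to 0} \alpha(\varepsilon, \delta) = 0, \tag{17}$$

then the process $\xi(t)$ has no discontinuities of the second kind.

It is sufficient to prove that with probability 1 every sample function of
$\xi(t)$ possesses only a finite number of $\varepsilon$-oscillations. Let $J$
be the separability set of the process $\xi(t)$. We represent this set as
$J = \bigcup_{n=1}^{\infty} I_n$, where $\{I_n\}$ is a monotonically increasing
sequence of sets consisting of a finite number of elements. Let
$\varepsilon > 0$ be given. Subdivide $[0, T]$ into $m$ intervals $\Delta_r$,
$r = 1, \ldots, m$, of equal lengths such that

$$2\alpha\Bigl(\frac{\varepsilon}{4}, \frac{T}{m}\Bigr) < 1.$$

Then, denoting by $c_r$ the left endpoint of $\Delta_r$ and by
$A^{\infty}(\varepsilon, \cdot)$ the event that the number of
$\varepsilon$-oscillations is infinite, we have for every $k$

$$P\{A^{\infty}(\varepsilon, J \cap \Delta_r) \mid \mathfrak{F}_{c_r}\}
  \le P\{A^k(\varepsilon, J \cap \Delta_r) \mid \mathfrak{F}_{c_r}\}
  = \lim_{n \to \infty} P\{A^k(\varepsilon, I_n \cap \Delta_r) \mid \mathfrak{F}_{c_r}\}
  \le \Bigl[2\alpha\Bigl(\frac{\varepsilon}{4}, \frac{T}{m}\Bigr)\Bigr]^k;$$

hence

$$P\{A^{\infty}(\varepsilon, J \cap \Delta_r) \mid \mathfrak{F}_{c_r}\} = 0 \pmod{P}
  \quad \text{and} \quad P\{A^{\infty}(\varepsilon, J \cap \Delta_r)\} = 0.$$

Consequently, $P\{A^{\infty}(\varepsilon, J)\} = 0$. The theorem is thus
proved. □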

We present a number of important corollaries of the theorem just 
proved. 

Theorem 3. A separable stochastically continuous process $\xi(t)$,
$t \in [0, T]$, with independent increments and with values in a linear normed
space $\mathcal{Y}$ has no discontinuities of the second kind.

Indeed, we have from the definition of a process with independent
increments

$$P\{|\xi(t) - \xi(s)| > \varepsilon \mid \mathfrak{F}_s\} = P\{|\xi(t) - \xi(s)| > \varepsilon\} \pmod{P}.$$

On the other hand, it follows from the property of uniform stochastic
continuity (cf. Theorem 4, Section 2) that

$$\alpha(\varepsilon, \delta) = \sup[P\{|\xi(s) - \xi(t)| > \varepsilon\};\ 0 \le s \le t \le s + \delta \le T]$$

tends to zero as $\delta \to 0$ for any $\varepsilon > 0$. Therefore the
conditions of Theorem 2 are satisfied. □

Theorem 2 implies some sharp results for Markov processes also. 

Theorem 4. If $\xi(t)$, $t \in [0, T]$, is a separable Markov process with
values in a metric space $\mathcal{Y}$ and with transition function
$P(t, x, s, A)$ satisfying the condition

$$\alpha(\varepsilon, \delta) = \sup[P(s, y, t, \bar{S}_\varepsilon(y));\
  y \in \mathcal{Y},\ 0 \le s \le t \le s + \delta \le T] \to 0$$

as $\delta \to 0$, where $S_\varepsilon(y)$ is a sphere of radius
$\varepsilon$ with center at the point $y$ and $\bar{S}_\varepsilon(y)$ is its
complement, then the process $\xi(t)$ has no discontinuities of the second
kind.

The last assertion follows directly from Theorem 2 and the definition
of a Markov process. □

Regularization of sample functions of a process without discontinuities of
the second kind. As was mentioned previously, when considering functions
without discontinuities of the second kind we identify functions
which have the same right-hand and left-hand limits at each point.

Recall that if the process is separable, then the values of the sample
functions of $\xi(t)$ with probability 1 are limiting values of the sequences
$\xi(t_i)$ for $t_i \to t$, where the $t_i$ belong to the separability set. If,
moreover, the process does not have discontinuities of the second kind, then,
with probability 1, $\xi(t)$ is equal to $\xi(t - 0)$ or $\xi(t + 0)$ for each
$t$.

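Theorem 5. If $\xi(t)$ is a stochastically continuous process without
discontinuities of the second kind and with values in a metric space
$\mathcal{Y}$, then there exists an equivalent process $\xi'(t)$ whose sample
functions are continuous from the right (mod $P$).

Proof. Let $A$ denote the event that
$\lim_{n \to \infty} \xi\bigl(t + \frac{1}{n}\bigr)$ exists for each
$t \in [0, T)$; this event has probability 1. Set
$\xi'(t) = \lim_{n \to \infty} \xi\bigl(t + \frac{1}{n}\bigr)$ in the case of
$A$ and $\xi'(t) = \xi(t)$ in the case of $\bar{A}$. We have

$$P\{\xi'(t) \ne \xi(t)\}
  = \lim_{m \to \infty} P\Bigl\{\Bigl[\rho(\xi'(t), \xi(t)) > \frac{1}{m}\Bigr] \cap A\Bigr\}.$$

On the other hand

$$P\Bigl\{\Bigl[\rho(\xi'(t), \xi(t)) > \frac{1}{m}\Bigr] \cap A\Bigr\}
  \le \lim_{k \to \infty} P\Bigl\{\bigcap_{n=k}^{\infty} \Bigl[\rho\Bigl(\xi(t), \xi\Bigl(t + \frac{1}{n}\Bigr)\Bigr) > \frac{1}{m}\Bigr]\Bigr\}
  \le \lim_{n \to \infty} P\Bigl\{\rho\Bigl(\xi(t), \xi\Bigl(t + \frac{1}{n}\Bigr)\Bigr) > \frac{1}{m}\Bigr\} = 0.$$

Therefore $P\{\xi'(t) \ne \xi(t)\} = 0$. It remains to observe that the
function $\xi'(t)$ is continuous from the right on $A$. The theorem is thus
proved. □

The existence of a stochastically equivalent process continuous from
the left is proved analogously.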

Martingales. Consider properties of sample functions of a separable
semimartingale $\{\xi(t), \mathfrak{F}_t, t \in [0, T]\}$. The general
definition of semimartingales and martingales was given earlier (Chapter II,
Section 2). In that chapter we obtained important properties of
semimartingales with discrete arguments. Note that the inequalities obtained
in Section 2 of Chapter II can easily be carried over to the case of
separable submartingales. Indeed, for a separable process the event
$\sup\{\xi(t), t \in [0, T]\} \ne \sup\{\xi(t), t \in I\}$, where $I$ is the
separability set of the process $\xi(t)$, has probability 0. Therefore the
corresponding random variables have the same distribution. Furthermore, if
the sample function of the process $\xi(t)$ on the interval $[0, T]$
upcrosses the half-interval $[a, b)$ $n$ times and $\omega \notin N$, where
$N$ is the exceptional set appearing in the definition of separability, then
$\xi(t)$, restricted to $I$, also upcrosses the half-interval $[a, b)$ $n$
times. This means that the distribution of the variables
$\nu_{[0, T]}[a, b)$ and $\nu_I[a, b)$ is the same (the lower index at $\nu$
denotes the set to which the process is restricted).

Theorem 6. A separable semimartingale on $[0, T]$ has no discontinuities
of the second kind.

To prove this theorem one should basically repeat the argument
given in the proof of Theorem 1, Section 2, Chapter II. It follows from
the inequality $P\{\sup\{\xi^+(t), t \in [0, T]\} > C\} \le C^{-1} E\xi^+(T)$
that $\sup\{\xi(t), t \in [0, T]\} < \infty$ with probability 1. Analogously
it follows from the inequality
$E\nu[a, b) \le (b - a)^{-1} E(\xi(T) - b)^+$ as $a \to -\infty$ that with
probability 1 $\inf\{\xi(t), t \in [0, T]\} > -\infty$. Furthermore, since
$\nu[a, b)$ is integrable there exists a set $N_1 \in \mathfrak{S}$ with
$P(N_1) = 0$ such that for $\omega \notin N_1$, $\xi(t)$ crosses any
half-interval $[a, b)$ only a finite number of times and consequently
$\xi(t)$ has no discontinuities of the second kind. We may choose $N_1$ to be
the sum of all $N(a, b)$, where $N(a, b)$ is the set on which
$\nu[a, b) = \infty$, when $a$ and $b$ run through all the rational numbers
and $a < b$. The theorem is thus proved. □

Let $\mathfrak{F}_{t-0}$ denote the minimal $\sigma$-algebra containing
$\mathfrak{F}_s$ for $s < t$ and $\mathfrak{F}_{t+0}$ the intersection of
$\mathfrak{F}_s$ for $s > t$. Clearly
$\mathfrak{F}_{t-0} \subset \mathfrak{F}_t \subset \mathfrak{F}_{t+0}$;
$\xi(t - 0)$ is an $\mathfrak{F}_{t-0}$-measurable variable, while
$\xi(t + 0)$ is an $\mathfrak{F}_{t+0}$-measurable variable.

Theorem 7. Let $\{\xi(t), \mathfrak{F}_t, t \in [0, T]\}$ be a separable
submartingale. Then $\{\xi(t + 0), \mathfrak{F}_{t+0}, t \in [0, T]\}$
($\xi(T + 0) = \xi(T)$) is also a submartingale whose sample functions are
continuous from the right with probability 1. Moreover,
$P\{\xi(t) = \xi(t + 0)\} = 1$ at each point at which $E\xi(t)$ is continuous
and $\mathfrak{F}_t = \mathfrak{F}_{t+0}$.

Proof. Note that $\max(\xi(t), a)$ is a submartingale (Chapter II, Section 2)
and moreover a uniformly integrable family of random variables (Chapter
II, Section 2). Therefore, for $s \le t$

$$\int_A \max(\xi(s), a)\, dP \le \lim_{t' \downarrow t} \int_A \max(\xi(t'), a)\, dP
  = \int_A \max(\xi(t + 0), a)\, dP$$

for any $A \in \mathfrak{F}_s$. Letting $a \to -\infty$ we obtain

$$\int_A \xi(s)\, dP \le \int_A \xi(t + 0)\, dP,$$

i.e.

$$\xi(s) \le E\{\xi(t + 0) \mid \mathfrak{F}_s\} \quad \text{for } s \le t.$$

Hence

$$\xi(s + 0) \le E\{\xi(t + 0) \mid \mathfrak{F}_{s+0}\}.$$

This proves the first part of the theorem. It is easy to verify that the
previous discussions also yield the following inequalities

$$\xi(t) \le E\{\xi(t + 0) \mid \mathfrak{F}_t\} \le E\{\xi(t') \mid \mathfrak{F}_t\} \pmod{P} \quad \text{for } t' > t,$$

which implies that

$$\xi(t) \le \xi(t + 0) \le E\{\xi(t') \mid \mathfrak{F}_t\} \pmod{P}$$

at a point $t$ such that $\mathfrak{F}_t = \mathfrak{F}_{t+0}$.

Now if $E\xi(t') \to E\xi(t)$ as $t' \downarrow t$, then
$\xi(t + 0) = \xi(t)$ (mod $P$). It is also easy to verify that the functions
$\xi(t + 0)$ are continuous from the right. The theorem is thus proved. □



§ 5. Continuous Processes 

Conditions for continuity of processes without discontinuities of the
second kind.

We shall assume here as before that $\mathcal{Y}$ is a complete metric space,
and $\xi(t)$, $t \in [0, T]$, is a random process with values in
$\mathcal{Y}$.

Definition 1. The process $\xi(t)$, $t \in [0, T]$, is called continuous if
almost all of its sample functions are continuous on $[0, T]$.

For processes without discontinuities of the second kind one can
formulate a rather simple sufficient condition for continuity.

Theorem 1. Let $\{t_{nk}, k = 0, 1, \ldots, m_n\}$, $n = 1, 2, \ldots$, be a
sequence of subdivisions of the interval $[0, T]$,
$0 = t_{n0} < t_{n1} < \ldots < t_{n m_n} = T$, and
$\lambda_n = \max_{1 \le k \le m_n} (t_{nk} - t_{n, k-1}) \to 0$ as
$n \to \infty$. If the separable process $\xi(t)$ has no discontinuities of
the second kind, then the condition

$$\sum_{k=1}^{m_n} P\{\rho[\xi(t_{nk}), \xi(t_{n, k-1})] > \varepsilon\} \to 0 \quad \text{as } n \to \infty \tag{1}$$

for any $\varepsilon > 0$ is a sufficient condition for continuity of this
process.

Proof. Denote by $\nu_\varepsilon$ ($0 \le \nu_\varepsilon \le \infty$) the
number of values $t$ such that
$\rho[\xi(t + 0), \xi(t - 0)] > 2\varepsilon$, and by
$\nu_\varepsilon^{(n)}$ the number of indices $k$ such that
$\rho[\xi(t_{nk}), \xi(t_{n, k-1})] > \varepsilon$. Clearly
$\nu_\varepsilon \le \varliminf_{n \to \infty} \nu_\varepsilon^{(n)}$. On the
other hand

$$E\nu_\varepsilon^{(n)} = \sum_{k=1}^{m_n} P\{\rho[\xi(t_{nk}), \xi(t_{n, k-1})] > \varepsilon\}.$$

In view of the Fatou lemma
$E\nu_\varepsilon \le E \varliminf_{n \to \infty} \nu_\varepsilon^{(n)}
\le \varliminf_{n \to \infty} E\nu_\varepsilon^{(n)} = 0$. Hence
$E\nu_\varepsilon = 0$, i.e. $\nu_\varepsilon = 0$ with probability 1 for any
$\varepsilon > 0$. Consequently $\xi(t - 0) = \xi(t + 0)$ for any $t$ with
probability 1. In view of the separability of the process
$\xi(t) = \xi(t - 0) = \xi(t + 0)$, i.e. the process is continuous. □
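Condition (1) can be evaluated in closed form for two standard processes, which makes the role of the sum visible. The sketch below is an illustration only and is not taken from the text: it assumes uniform subdivisions $t_{nk} = kT/n$, a standard Wiener process (increments distributed as $N(0, T/n)$) and a Poisson process with a hypothetical intensity `rate`. The function names are likewise hypothetical.

```python
import math


def increment_sum_wiener(n: int, eps: float, T: float = 1.0) -> float:
    """Sum in condition (1) for a standard Wiener process on a uniform grid.

    Each increment is N(0, T/n), so P{|increment| > eps} = erfc(eps / sqrt(2 T / n)).
    """
    sigma = math.sqrt(T / n)
    return n * math.erfc(eps / (sigma * math.sqrt(2.0)))


def increment_sum_poisson(n: int, eps: float, rate: float, T: float = 1.0) -> float:
    """Sum in condition (1) for a Poisson process with intensity `rate`.

    For 0 < eps < 1 an increment exceeds eps exactly when at least one jump
    falls in the subinterval, which has probability 1 - exp(-rate * T / n).
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("this sketch assumes 0 < eps < 1")
    return n * -math.expm1(-rate * T / n)


for n in (10**2, 10**4, 10**6):
    print(n, increment_sum_wiener(n, 0.1), increment_sum_poisson(n, 0.5, 2.0))
```

For the Wiener process the sum tends to 0 as $n \to \infty$, in agreement with continuity of its sample functions; for the Poisson process it tends to `rate * T`, so condition (1) fails, as it must for a process whose sample functions have jumps.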

Corollary. If $\{\xi(t), \mathfrak{F}_t, t \in [0, T]\}$ is a separable
semimartingale and

$$\sum_{k=1}^{m_n} P\{|\xi(t_{nk}) - \xi(t_{n, k-1})| > \varepsilon\} \to 0 \quad \text{as } n \to \infty,$$

then $\xi(t)$ is a continuous process.

This corollary follows from the fact that a separable semimartingale
has no discontinuities of the second kind. □

We now apply Theorem 1 to the processes satisfying the conditions
of Theorem 2 in Section 4. Let $\alpha(\varepsilon, \delta)$ be determined by
relation (13) of Section 4.

Theorem 2. If the process $\xi(t)$ is separable and

$$\lim_{\delta \to 0} \frac{\alpha(\varepsilon, \delta)}{\delta} = 0 \tag{2}$$

for any $\varepsilon > 0$, then the process $\xi(t)$ is continuous.

Proof. Since the process $\xi(t)$ has no discontinuities of the second kind
provided condition (2) is satisfied, it is sufficient to verify relation (1).
Noting that

$$P\{\rho[\xi(t_{nk}), \xi(t_{n, k-1})] > \varepsilon\} \le \alpha(\varepsilon, \Delta t_{nk}),
  \quad \text{where } \Delta t_{nk} = t_{nk} - t_{n, k-1},$$

we obtain that

$$\sum_{k=1}^{m_n} P\{\rho[\xi(t_{nk}), \xi(t_{n, k-1})] > \varepsilon\}
  \le T \max_{1 \le k \le m_n} \frac{\alpha(\varepsilon, \Delta t_{nk})}{\Delta t_{nk}} \to 0$$

as $\lambda_n \to 0$. The theorem is thus proved. □

Applying Theorem 2 to Markov processes we obtain the following 
condition for continuity of a Markov process : 

Theorem 3. Let $\xi(t)$ be a separable Markov process and let

$$\frac1\delta\,P(t,y,s,\bar S_\varepsilon(y))\to0$$

for $\delta\to0$ and for any fixed $\varepsilon>0$ uniformly in $y$, $s$ and $t$, where $0\le t-s\le\delta$; then the process $\xi(t)$ is continuous.

Here $\bar S_\varepsilon(x)$ denotes the complement of the sphere $S_\varepsilon(x)$ with the center at point $x$ and radius $\varepsilon$.

Processes with independent increments. The theorem just proved gives 
only sufficient conditions for continuity of a random process. It turns 
out however that for the particular case of processes with independent 
increments the conditions of Theorem 1 are also necessary. 

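Theorem 4. If the process $\xi(t)$ with independent increments is continuous, then condition (1) is satisfied for an arbitrary sequence $\{t_{nk},\ k=0,\dots,m_n\}$, $n=1,2,\dots$, of subdivisions of the interval $[0,T]$ for which $\lambda_n=\max_{1\le k\le m_n}\Delta t_{nk}\to0$.

Proof. Set $\Delta_h=\sup_{|t_1-t_2|\le h}\rho[\xi(t_1),\xi(t_2)]$. In view of the continuity of the process $\xi(t)$, $\Delta_h\to0$ for $h\to0$ with probability 1. Therefore $\lim_{h\to0}\mathsf{P}\{\Delta_h>\varepsilon\}=0$. On the other hand, if $\lambda_n<h$, then, writing $\rho_k=\rho[\xi(t_{nk}),\xi(t_{n,k-1})]$ and using the independence of the increments,

$$\mathsf{P}\{\Delta_h>\varepsilon\}\ge\mathsf{P}\{\sup_k\rho_k>\varepsilon\}=\mathsf{P}\{\rho_1>\varepsilon\}+\mathsf{P}\{\rho_1\le\varepsilon\}\mathsf{P}\{\rho_2>\varepsilon\}+\dots\ge\mathsf{P}\{\Delta_h\le\varepsilon\}\sum_{k=1}^{m_n}\mathsf{P}\{\rho_k>\varepsilon\};$$

hence

$$\sum_{k=1}^{m_n}\mathsf{P}\{\rho[\xi(t_{nk}),\xi(t_{n,k-1})]>\varepsilon\}\le\frac{\mathsf{P}\{\Delta_h>\varepsilon\}}{\mathsf{P}\{\Delta_h\le\varepsilon\}}\to0$$

as $h\to0$ and any $\varepsilon>0$.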

The theorem is thus proved. □ 
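Condition (1) can be checked by hand for the two classical processes with independent increments. The following is a minimal sketch, not part of the original text: it assumes equal subdivisions of $[0,T]$, a standard Wiener process (increments distributed as $N(0,T/n)$) and a Poisson process of intensity `rate` with $0<\varepsilon<1$; the function names are hypothetical.

```python
import math

def wiener_sum(T, n, eps):
    """Sum in condition (1) for a standard Wiener process on n equal
    subintervals of [0, T]; every increment is N(0, T/n)."""
    h = T / n
    # P{|N(0, h)| > eps} = erfc(eps / sqrt(2 h))
    return n * math.erfc(eps / math.sqrt(2.0 * h))

def poisson_sum(T, n, eps, rate=1.0):
    """Same sum for a Poisson process with intensity `rate`, 0 < eps < 1:
    an increment exceeds eps exactly when at least one jump occurs."""
    h = T / n
    return n * (-math.expm1(-rate * h))

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, wiener_sum(1.0, n, 0.1), poisson_sum(1.0, n, 0.1))
```

Under these assumptions the Wiener sum tends to zero while the Poisson sum tends to `rate * T`, in agreement with Theorem 4: the Poisson process has independent increments but is not continuous.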
It is now easy to present a complete description of continuous processes with independent increments and values in a finite-dimensional space.

Theorem 5. A random process $\xi(t)$, $t\ge0$, $\xi(0)=0$, with values in $\mathcal{R}^m$ and with independent increments is continuous if and only if $\xi(t)$ is a Gaussian process with continuous mean value $a(t)$ and continuous matrix correlation function $R(t,s)=\sigma^2(\min(t,s))$, where $\sigma^2(t)$ is a matrix function, $\sigma^2(0)=0$, and $\sigma^2(t)-\sigma^2(s)$ for $s<t$ is a non-negative definite matrix.

Proof. Let $\xi(t)$ be a continuous process with independent increments. We prove that $\xi(t)-\xi(s)$ $(s<t)$ has a normal distribution. Choose an arbitrary vector $z\in\mathcal{R}^m$. The scalar process $\eta(t)=(z,\xi(t))$ is also a continuous process with independent increments. If we show that $\eta(t)-\eta(s)$ has a normal distribution, then it will follow that $\xi(t)-\xi(s)$ has an $m$-dimensional normal distribution. Let $t_{nk}$, $k=0,1,\dots,m_n$, be the subdivision of the interval $(s,t)$ into intervals of equal length such that (cf. Theorem 4)

$$\sum_{k=1}^{m_n}\mathsf{P}\Bigl\{|\eta(t_{nk})-\eta(t_{n,k-1})|>\frac1n\Bigr\}<\frac1n. \tag{3}$$

Set $\eta_{nk}=\eta(t_{nk})-\eta(t_{n,k-1})$, $\eta'_{nk}=\eta_{nk}$ if $|\eta_{nk}|\le1/n$ and $\eta'_{nk}=0$ otherwise, and let $\eta'_n=\sum_k\eta'_{nk}$. It follows from inequality (3) that $\mathsf{P}\{\eta'_n\ne\eta(t)-\eta(s)\}<1/n$; hence $\mathsf{P}\text{-}\lim\eta'_n=\eta(t)-\eta(s)$. Let $a_{nk}=\mathsf{E}\eta'_{nk}$, $a'_n=\sum_ka_{nk}$, $\sigma^2_{nk}=\operatorname{Var}\eta'_{nk}$, $\sigma_n^2=\sum_k\sigma^2_{nk}$. Consider the following two cases:

1) $\liminf\sigma_n^2<\infty$; 2) $\lim\sigma_n^2=\infty$. In the first case there exists a subsequence $n_r$ such that $\lim\sigma^2_{n_r}=\sigma^2<\infty$. Since

$$\eta'_{n_r}-a'_{n_r}=\sum_k(\eta'_{n_rk}-a_{n_rk}),$$

the central limit theorem is applicable to the sum in the r.h.s. of the last equation. The distribution of $\eta'_{n_r}-a'_{n_r}$ thus converges weakly to the normal distribution with parameters $(0,\sigma^2)$. Since $\eta'_{n_r}$ converges in probability to a limit, $a'_{n_r}$ should also converge to a certain limit $a$. Thus $\eta(t)-\eta(s)=a+\eta$, where $\eta$ is a Gaussian random variable.

In the second case for any $c>0$ a $q_n$ can be found such that

$$\sum_{k=1}^{q_n}\sigma^2_{nk}\to c^2.$$

This follows from the fact that the quantities $\sigma^2_{nk}$ are uniformly small: $\sigma^2_{nk}\le1/n^2$. The central limit theorem is applicable to the sum $\sum_{k=1}^{q_n}(\eta'_{nk}-a_{nk})$. But then it follows from the equation

$$\mathsf{E}e^{iu\eta'_n}=e^{iua'_n}\prod_{k=1}^{m_n}\mathsf{E}e^{iu(\eta'_{nk}-a_{nk})}$$

that

$$\limsup_{n\to\infty}\bigl|\mathsf{E}e^{iu\eta'_n}\bigr|\le\lim_{n\to\infty}\Bigl|\prod_{k=1}^{q_n}\mathsf{E}e^{iu(\eta'_{nk}-a_{nk})}\Bigr|=e^{-u^2c^2/2},$$

where $c$ is an arbitrary number. Therefore $\lim\mathsf{E}e^{iu\eta'_n}=0$ for $u\ne0$, which contradicts the convergence of $\eta'_n$ to $\eta(t)-\eta(s)$. Hence the second case cannot hold. Thus we have shown that $\xi(t)-\xi(s)$ has a normal distribution.
Let $a(t)=\mathsf{E}\xi(t)$, $\sigma^2(t)=\mathsf{E}(\xi(t)-a(t))(\xi(t)-a(t))^*$. If $t>s$, we have for the matrix correlation function

$$R(t,s)=\mathsf{E}(\xi(t)-a(t))(\xi(s)-a(s))^*=\sigma^2(s).$$

It follows from the continuity of $\xi(t)$ that the characteristic function

$$J(u,t)=\mathsf{E}e^{i(u,\xi(t))}=\exp\bigl\{i(u,a(t))-\tfrac12(\sigma^2(t)u,u)\bigr\}$$

is continuous in $t$. This is possible if and only if $a(t)$ and $\sigma^2(t)$ are continuous functions in $t$ and $\sigma^2(t)$ satisfies the conditions of the theorem. The first part of the theorem is thus proved.

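Now let $\xi(t)$ be a Gaussian process with mean $a(t)$ and matrix correlation function $R(t,s)=\sigma^2(\min(t,s))$, where $a(t)$ and $\sigma^2(t)$ are continuous. Set $\xi_1(t)=\xi(t)-a(t)$. Then if $t_1<t_2<t_3<t_4$,

$$\mathsf{E}(\xi_1(t_4)-\xi_1(t_3))(\xi_1(t_2)-\xi_1(t_1))^*=R(t_4,t_2)-R(t_3,t_2)-R(t_4,t_1)+R(t_3,t_1)=$$
$$=\sigma^2(t_2)-\sigma^2(t_2)-\sigma^2(t_1)+\sigma^2(t_1)=0,$$

i.e. the process $\xi_1(t)$ has independent increments. Next

$$\mathsf{E}|\xi_1(t_2)-\xi_1(t_1)|^2=\operatorname{Sp}\{\sigma^2(t_2)-\sigma^2(t_1)\}$$

and from the well-known expression for the moments of a Gaussian distribution we have

$$\mathsf{E}|\xi_1(t_2)-\xi_1(t_1)|^4\le3[\operatorname{Sp}\{\sigma^2(t_2)-\sigma^2(t_1)\}]^2.$$

Utilizing Chebyshev's inequality we obtain

$$\sum_{k=1}^{m_n}\mathsf{P}\{|\xi_1(t_{nk})-\xi_1(t_{n,k-1})|>\varepsilon\}\le\sum_{k=1}^{m_n}\frac{3[\operatorname{Sp}\{\sigma^2(t_{nk})-\sigma^2(t_{n,k-1})\}]^2}{\varepsilon^4}\le$$
$$\le\frac{3}{\varepsilon^4}\max_k\operatorname{Sp}\{\sigma^2(t_{nk})-\sigma^2(t_{n,k-1})\}\operatorname{Sp}\sigma^2(T)\to0$$

as $\max_k(t_{nk}-t_{n,k-1})\to0$. In view of Theorem 1 the process $\xi_1(t)$ and therefore the process $\xi(t)$ are continuous. The theorem is thus proved. □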

Kolmogorov’s conditions for continuity of a random process. We prove a 
convenient and direct sufficient condition for continuity of a random 
process, which does not utilize the assumption of absence of disconti- 
nuities of the second kind. This condition is based on the simplified 
version of Lemmas 3 and 4 in Section 4. 



Lemma 1. Let $\xi(t)$, $t\in[0,T]$, be a separable process satisfying the following condition: there exist a non-negative monotonically non-decreasing function $g(h)$ and a function $q(C,h)$, $h\ge0$, such that

$$\mathsf{P}\{\rho(\xi(t+h),\xi(t))>Cg(h)\}\le q(C,h) \tag{4}$$

and

$$G=\sum_{n=0}^\infty g(2^{-n}T)<\infty,\qquad Q(C)=\sum_{n=1}^\infty2^nq(C,2^{-n}T)<\infty. \tag{5}$$

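Then

$$\mathsf{P}\Bigl\{\sup_{0\le t'<t''\le T}\rho(\xi(t'),\xi(t''))>N\Bigr\}\le Q\Bigl(\frac{N}{2G}\Bigr) \tag{6}$$

and

$$\mathsf{P}\Bigl\{\sup_{|t'-t''|\le\varepsilon}\rho(\xi(t'),\xi(t''))>CG\Bigl(\Bigl[\lg_2\frac{T}{2\varepsilon}\Bigr]\Bigr)\Bigr\}\le Q\Bigl(\Bigl[\lg_2\frac{T}{2\varepsilon}\Bigr],C\Bigr), \tag{7}$$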
where

$$G(m)=\sum_{n=m}^\infty g(2^{-n}T),\qquad Q(m,C)=\sum_{n=m}^\infty2^nq(C,2^{-n}T). \tag{8}$$

To prove this lemma it is sufficient to repeat in simplified form the arguments presented in the proofs of Lemmas 3 and 4 of Section 4. We shall omit the details and present a brief outline of the argument. We introduce the events

$$A_{nk}=\Bigl\{\rho\Bigl(\xi\Bigl(\frac{(k+1)T}{2^n}\Bigr),\xi\Bigl(\frac{kT}{2^n}\Bigr)\Bigr)\le Cg(2^{-n}T)\Bigr\},\quad k=0,1,\dots,2^n-1,\ n=0,1,2,\dots,$$

and set $D_n=\bigcap_{m=n}^\infty\bigcap_{k=0}^{2^m-1}A_{mk}$. Then $\mathsf{P}(\bar D_n)\le Q(n,C)$. It follows from $D_0$ that for any $t'$ and $t''$ belonging to $J$ (as defined in Lemma 3, Section 4) $\rho(\xi(t'),\xi(t''))\le2CG$; if moreover, in addition to $D_n$, the inequalities $0<t''-t'<T2^{-n}$ are satisfied, then $\rho(\xi(t'),\xi(t''))\le2CG(n)$. Arguing in the same manner as at the conclusion of the proof of the above-mentioned lemmas we obtain the required assertion. One should also keep in mind that conditions (4) and (5) yield stochastic continuity of the process $\xi(t)$. □

Theorem 6. Let the conditions of Lemma 1 be satisfied. Then the process $\xi(t)$ is continuous. If, moreover, $Q(m,C)\to0$ for some $m$ and $C\to\infty$, the process $\xi(t)$ then possesses the following property: with probability 1 there exists a constant $\gamma=\gamma(\omega)$ such that



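$$\sup_{|t'-t''|\le\varepsilon}\rho(\xi(t'),\xi(t''))\le\gamma\,G\Bigl(\Bigl[\lg_2\frac{T}{2\varepsilon}\Bigr]\Bigr)\pmod{\mathsf{P}}. \tag{9}$$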



The theorem follows from Lemma 1. □ 

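As a particular case in which conditions (4) and (5) are fulfilled we consider the process satisfying

$$\mathsf{E}\rho^p(\xi(t+h),\xi(t))\le K|h|^{1+r}, \tag{10}$$

where $p>0$, $r>0$. Set $g(h)=h^{r'/p}$, where $0<r'<r$. By Chebyshev's inequality one can then take $q(C,h)=KC^{-p}h^{1+r-r'}$, so that

$$G\Bigl(\Bigl[\lg_2\frac{T}{2\varepsilon}\Bigr]\Bigr)\le K_1\varepsilon^{r'/p}\quad\text{and}\quad Q\Bigl(\Bigl[\lg_2\frac{T}{2\varepsilon}\Bigr],C\Bigr)\le K_2C^{-p}\varepsilon^{r-r'},$$

where $K_1$ and $K_2$ are constants. From Theorem 6 now follows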

Corollary 1. If a separable random process $\xi(t)$ satisfies condition (10) then its sample functions satisfy with probability one a Lipschitz condition $\rho[\xi(t'),\xi(t'')]\le\gamma|t''-t'|^{r'/p}$, where $\gamma=\gamma(\omega)$ is a constant and $r'$ can be any number in $(0,r)$.

Corollary 2. Consider the Wiener process for which $\mathsf{E}[\xi(t+h)-\xi(t)]=0$, $\mathsf{E}[\xi(t+h)-\xi(t)]^2=h$. Since $\mathsf{E}|\xi(t+h)-\xi(t)|^{2m}=(2m-1)!!\,|h|^m$ for an arbitrary integer $m$, the sample functions of the separable Wiener process satisfy with probability 1 a Lipschitz condition of order $\frac12-\varepsilon$, where $\varepsilon$ is an arbitrary positive number.
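The arithmetic behind Corollary 2 can be made explicit. The sketch below is an illustration only and assumes that condition (10) is read with $p=2m$ and $1+r=m$, so that the admissible Lipschitz orders $r'/p$ fill the interval $(0,r/p)$; the helper names are hypothetical.

```python
from fractions import Fraction

def double_factorial_odd(m):
    """(2m-1)!! = 1 * 3 * ... * (2m-1), the 2m-th moment of N(0, 1)."""
    result = 1
    for j in range(1, 2 * m, 2):
        result *= j
    return result

def lipschitz_order_bound(m):
    """Upper end r/p of the admissible orders r'/p when
    E|xi(t+h) - xi(t)|^(2m) = (2m-1)!! h^m, i.e. p = 2m and 1 + r = m."""
    p = 2 * m
    r = m - 1
    return Fraction(r, p)

if __name__ == "__main__":
    for m in (2, 5, 50, 500):
        print(m, double_factorial_odd(min(m, 5)), lipschitz_order_bound(m))
```

The bound $(m-1)/2m$ increases to $\frac12$ as $m$ grows, which is why every order $\frac12-\varepsilon$ is attainable while the order $\frac12$ itself is not reached by this argument.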

Corollary 3. If a separable process satisfies conditions (4) and (5) and $q^mG(m)\le K$ for all $m$ and some $q>1$, then the sample functions of this process satisfy with probability 1 a Lipschitz condition

Consider now another condition more general than (10) which assures the fulfillment of assumptions (4) and (5). Let

$$\mathsf{E}\rho^p(\xi(t+h),\xi(t))\le\frac{L|h|}{|\lg_2|h||^{1+r}},\qquad p<r. \tag{11}$$

If we set

$$g(h)=|\lg_2|h||^{-r'/p},\quad\text{where } p<r'<r,$$

then

$$G=\sum_{n=0}^\infty|\lg_2(2^{-n}T)|^{-r'/p}<\infty,\qquad Q(C)=\frac{LT}{C^p}\sum_{n}\frac{1}{|\lg_2(2^{-n}T)|^{1+r-r'}}<\infty.$$

Corollary 4. If a separable process $\xi(t)$ satisfies relation (11) then the process is continuous.

Gaussian processes. We now apply the preceding results to a one-dimensional separable real Gaussian process $\xi(t)$, $t\in[0,T]$, with the correlation function $R(s,t)$ and the mean value 0. The difference $\xi(t+h)-\xi(t)$ has the variance

$$\sigma^2(t,h)=R(t+h,t+h)-2R(t,t+h)+R(t,t).$$

Therefore

$$\mathsf{P}\{|\xi(t+h)-\xi(t)|>Cg(h)\}=\frac{2}{\sqrt{2\pi}}\int_a^\infty e^{-x^2/2}\,dx, \tag{12}$$

where $a=Cg(h)/\sigma(t,h)$. Utilizing the inequality

$$\int_a^\infty e^{-x^2/2}\,dx\le\frac1a\,e^{-a^2/2}$$

(which is easily verified using integration by parts), we obtain

$$\mathsf{P}\{|\xi(t+h)-\xi(t)|>Cg(h)\}\le\frac{2}{\sqrt{2\pi}}\,\frac{\sigma(t,h)}{Cg(h)}\,e^{-C^2g^2(h)/2\sigma^2(t,h)}. \tag{13}$$

Theorem 7. If a Gaussian process satisfies the condition

$$\sigma^2(t,h)\le\frac{K}{|\ln|h||^p},\qquad p>3, \tag{14}$$

then the process is continuous.

Proof. Set $g(h)=|\ln|h||^{-p'}$, where $p'$ is an arbitrary number satisfying the inequality $1<p'<\frac{p-1}{2}$. We then can choose

$$q(C,h)=\frac{K'}{C}\,|\ln|h||^{p'-p/2}\,e^{-(C^2/2K)|\ln|h||^{p-2p'}}$$

and series (5) will be convergent. From here the assertion of the theorem follows. □
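The tail inequality used to pass from (12) to (13) is easy to check numerically. The following sketch is illustrative only; it relies on the identity $\int_a^\infty e^{-x^2/2}dx=\sqrt{\pi/2}\,\operatorname{erfc}(a/\sqrt2)$, and the function names are hypothetical.

```python
import math

def gaussian_tail(a):
    """Exact value of the integral of exp(-x^2/2) over [a, infinity)."""
    return math.sqrt(math.pi / 2.0) * math.erfc(a / math.sqrt(2.0))

def tail_bound(a):
    """Integration-by-parts bound (1/a) * exp(-a^2/2), valid for a > 0."""
    return math.exp(-a * a / 2.0) / a

if __name__ == "__main__":
    for a in (0.1, 0.5, 1.0, 2.0, 4.0, 8.0):
        print(a, gaussian_tail(a), tail_bound(a))
```

The bound is crude for small $a$ but becomes sharp as $a\to\infty$, which is the regime $a=Cg(h)/\sigma(t,h)$ with large $C$ that matters for the convergence of series (5).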

Moreover, it follows from the second part of Theorem 6 that for each sample function of the process with probability 1 a constant $\gamma=\gamma(\omega)$ can be found such that

$$|\xi(t+h)-\xi(t)|\le\gamma\,|\ln|h||^{1-p'}.$$

If we now assume that the correlation function of the process $\xi(t)$ is smoother, then the sample functions will also be smoother. Assume that

$$\sigma^2(t,h)\le K|h|^p,\qquad p>0. \tag{15}$$



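It follows from (13) that $q(C,h)$ can be chosen as

$$q(C,h)=\frac{K'|h|^{p/2}}{Cg(h)}\,e^{-C^2g^2(h)/2K|h|^p}.$$

Set $g(h)=|h|^{p/2}|\ln|h||^{1+\varepsilon}$, where $\varepsilon>0$. Then

$$Q(m,C)=\sum_{n=m}^\infty\frac{K'}{C\,|n\ln2-\ln T|^{1+\varepsilon}}\exp\Bigl\{-\frac{C^2}{2K}|n\ln2-\ln T|^{2+2\varepsilon}+n\ln2\Bigr\}$$

tends to zero as $C\to\infty$. Next we have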



We have thus obtained: 

Theorem 8. If the correlation function of a Gaussian process satisfies (15) then its sample functions satisfy with probability 1 the following inequality

where $\varepsilon$ is an arbitrary positive number and $\gamma$ is a constant.

In particular, sample functions of a separable Wiener process satisfy with probability 1 the following inequality

$t,\ t+h\in[0,T]$

for any $\varepsilon>0$. This result represents a refinement of Corollary 2 of Theorem 6.




Chapter IV 



Linear Theory of Random Processes 



§ 1. Correlation Functions 

Positive definite kernels. The existence of an important and sufficiently 
wide class of problems whose solution requires a knowledge of only the 
very general properties of a random function and its first and second 
moment is indeed remarkable and non-trivial. A substantial part of the 
theory of linear transformations of random functions is devoted to these 
problems and they constitute the main topic of the present chapter. We 
shall therefore discuss here random functions with values in a linear 
space with finite moments of second order unless otherwise explicitly 
stipulated. 

Let $\zeta(x)$, $x\in X$, be a complex-valued random function with finite moments of the second order. We call these random functions Hilbert random functions. A Hilbert random function can be interpreted as a function defined on $X$ with values in the Hilbert space of random variables $\mathscr{L}_2$.

In particular if $X$ is an interval $(a,b)$ on the real line and $\zeta(x)$ is a curve in $\mathscr{L}_2$, then the notation $\zeta=\zeta(x)$, $x\in(a,b)$, is a parametric equation of this curve. In the present chapter mainly Hilbert random functions are considered, therefore the word Hilbert will often be omitted.

Set

$$a(x)=\mathsf{E}\zeta(x),\qquad R(x,y)=\mathsf{E}(\zeta(x)-a(x))\overline{(\zeta(y)-a(y))}. \tag{1}$$

The function $a(x)$ is called the mean value of $\zeta(x)$ and $R(x,y)$ is its correlation function. If we put $x=y$ then $R(x,x)=\mathsf{E}|\zeta(x)-a(x)|^2=\sigma^2(x)$ gives us the variance of the complex-valued random variable $\zeta(x)$. The correlation function coincides with the previously introduced covariance of the set of random variables $\zeta(x)-a(x)$.

It is sometimes more advantageous to utilize the correlation function in place of the covariance since the correlation function has an important probabilistic interpretation: it characterizes the degree of linear dependence between the values of a random function at two points.

On the other hand, the distinction between correlation function and covariance is inconsequential. While the correlation function is the covariance of the random function $\zeta(x)-a(x)$, the covariance, on the other hand, can be interpreted as the correlation function of $\zeta(x)e^{i\varphi}$, where $\varphi$ is a random variable uniformly distributed on $(-\pi,\pi)$ independent of $(\zeta(x),\ x\in X)$. This means that the classes of correlation and covariance functions coincide. Henceforth we shall consider correlation as well as covariance functions.

The covariance $B(x_1,x_2)=\mathsf{E}\zeta(x_1)\overline{\zeta(x_2)}$ of a random function $\zeta(x)$ possesses the characteristic property of positive definiteness. Let $X$ be an arbitrary set.

Definition 1. A complex-valued function $C(x_1,x_2)$, $(x_1,x_2)\in X^2$, is called a positive definite kernel on $X^2$ if for any $n$ $(n=1,2,\dots)$, $x_k\in X$ and any complex numbers $z_k$ $(k=1,2,\dots,n)$

$$\sum_{k,r=1}^nC(x_k,x_r)z_k\bar z_r\ge0. \tag{2}$$

Covariance $B(x_1,x_2)$ is a positive definite kernel on $X^2$. Indeed

$$\sum_{k,r=1}^nB(x_k,x_r)z_k\bar z_r=\mathsf{E}\Bigl|\sum_{k=1}^nz_k\zeta(x_k)\Bigr|^2\ge0.$$



The following properties of a positive definite kernel are easily derived from its definition:

1) $C(x,x)\ge0$, (3)

2) $C(x_1,x_2)=\overline{C(x_2,x_1)}$, (4)

3) $|C(x_1,x_2)|^2\le C(x_1,x_1)\,C(x_2,x_2)$, (5)

4) $|C(x_1,x_3)-C(x_2,x_3)|^2\le C(x_3,x_3)\,[C(x_1,x_1)+C(x_2,x_2)-2\operatorname{Re}C(x_1,x_2)]$. (6)

These properties may be easily verified directly for covariance. To obtain inequalities (3)-(6) in the general case we first put $n=1$ into (2). We thus obtain $C(x_1,x_1)|z_1|^2\ge0$, which yields (3). Next we assume $n=2$. We first note that $C(x_1,x_2)z_1\bar z_2+C(x_2,x_1)\bar z_1z_2$ is real, which implies (4). Inequality (5) is the condition of positive definiteness of a Hermitian quadratic form

$$\sum_{k,r=1}^2C(x_k,x_r)z_k\bar z_r.$$

To obtain (6) we set $n=3$, $z_1=z$, $z_2=-z$ in (2). Then

$$[C(x_1,x_1)+C(x_2,x_2)-2\operatorname{Re}C(x_1,x_2)]\,|z|^2+2\operatorname{Re}\{[C(x_1,x_3)-C(x_2,x_3)]\,z\bar z_3\}+C(x_3,x_3)|z_3|^2\ge0,$$

which implies (6).
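Definition 1 and properties (3)-(6) can be probed numerically on concrete kernels. The sketch below is only an illustration under stated assumptions: it uses the kernel $\min(s,t)$ on $[0,\infty)$ (the covariance of the Wiener process) and the kernel $e^{iu(x_1-x_2)}$ with a fixed real $u$; all function names are hypothetical.

```python
import cmath
import random

def quadratic_form(kernel, xs, zs):
    """Left-hand side of (2): sum over k, r of C(x_k, x_r) z_k conj(z_r)."""
    return sum(kernel(xk, xr) * zk * zr.conjugate()
               for xk, zk in zip(xs, zs)
               for xr, zr in zip(xs, zs))

def wiener_kernel(s, t):
    """Covariance min(s, t) of the Wiener process on [0, infinity)."""
    return complex(min(s, t))

def oscillation_kernel(x1, x2, u=1.7):
    """Covariance exp(i u (x1 - x2)) of a single random harmonic."""
    return cmath.exp(1j * u * (x1 - x2))

def random_points(n, rng):
    xs = [rng.uniform(0.0, 5.0) for _ in range(n)]
    zs = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    return xs, zs

if __name__ == "__main__":
    rng = random.Random(0)
    for kernel in (wiener_kernel, oscillation_kernel):
        xs, zs = random_points(6, rng)
        print(kernel.__name__, quadratic_form(kernel, xs, zs))
```

For both kernels the quadratic form comes out real and non-negative up to rounding error; a kernel that violated (4) or (5) would already fail this test with $n=2$.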

The cross correlation function characterizes the degree of linear dependence between the two random functions $\zeta_1(x)$ and $\zeta_2(x)$ with finite moments.

Definition 2. Let $\zeta_1(x)$, $\zeta_2(x)$ be Hilbert random functions, $\mathsf{E}\zeta_i(x)=a_i(x)$. Then

$$R_{12}(x,y)=\mathsf{E}[\zeta_1(x)-a_1(x)]\,\overline{[\zeta_2(y)-a_2(y)]}$$

is called the cross correlation function of $\zeta_1(x)$ and $\zeta_2(x)$.
To describe the class of possible cross correlation functions and for solutions of many other problems, it is convenient to consider a sequence of Hilbert random functions $\zeta^1(x),\zeta^2(x),\dots,\zeta^m(x)$, $x\in X$, as components of a single random vector function $\zeta(x)$ with values in $\mathscr{Z}^m$. Moreover, as above, $\zeta(x)$ denotes a column-vector, and $\zeta^*(x)$ denotes a row-vector with components $\overline{\zeta^k(x)}$, $k=1,2,\dots,m$. Set

$$a(x)=\mathsf{E}\zeta(x)=\{\mathsf{E}\zeta^1(x),\mathsf{E}\zeta^2(x),\dots,\mathsf{E}\zeta^m(x)\},$$
$$R(x,y)=\|R^j_k(x,y)\|=\mathsf{E}(\zeta(x)-a(x))(\zeta(y)-a(y))^*.$$

The vector function $a(x)=(a^1(x),\dots,a^m(x))$ is called the mean value and $R(x,y)$ is the matrix correlation function of $\zeta(x)$.

We also note that

$$R^j_k(x,y)=\mathsf{E}(\zeta^j(x)-a^j(x))\overline{(\zeta^k(y)-a^k(y))},\qquad j,k=1,2,\dots,m.$$

Definition 3. A matrix function $C(x,y)=\|C^j_k(x,y)\|$, $j,k=1,\dots,m$, is called a matrix positive definite kernel on $X^2$ if for an arbitrary $n$, an arbitrary sequence of complex-valued vectors $z_k$ $(z_k\in\mathscr{Z}^m)$ and arbitrary points $x_k$ $(x_k\in X)$ we have

$$\sum_{j,k=1}^nz_j^*C(x_j,x_k)z_k\ge0. \tag{7}$$

A correlation matrix function is a matrix positive definite kernel. Indeed,

$$\sum_{j,k=1}^nz_j^*R(x_j,x_k)z_k=\mathsf{E}\sum_{j,k=1}^nz_j^*(\zeta(x_j)-a(x_j))(\zeta(x_k)-a(x_k))^*z_k=\mathsf{E}\Bigl|\sum_{k=1}^n(\zeta(x_k)-a(x_k))^*z_k\Bigr|^2\ge0.$$



We note several properties of matrix positive definite kernels $C(x,y)$.

1. Matrix $C(x,x)$ is positive definite, i.e.

$$z^*C(x,x)z=\sum_{j,k=1}^mC^j_k(x,x)\,\bar z^jz^k\ge0. \tag{8}$$

2. $C^*(x,y)=C(y,x)$. (9)

3. $|C^j_k(x,y)|^2\le C^j_j(x,x)\,C^k_k(y,y)$. (10)

Property (8) coincides with (7) for $n=1$. Equality (9) follows from the fact that the expression $z_1^*C(x,y)z_2+z_2^*C(y,x)z_1$ is real for any complex-valued vectors $z_1$ and $z_2$ $(z_k\in\mathscr{Z}^m)$. Observe also that inequality (7) is equivalent to the requirement that for any $n$ and any $x_1,x_2,\dots,x_n$ the block matrix

$$\begin{pmatrix}C(x_1,x_1)&C(x_1,x_2)&\dots&C(x_1,x_n)\\C(x_2,x_1)&C(x_2,x_2)&\dots&C(x_2,x_n)\\\vdots&\vdots&&\vdots\\C(x_n,x_1)&C(x_n,x_2)&\dots&C(x_n,x_n)\end{pmatrix}$$

be positive definite. Utilizing this remark we obtain in the case $n=2$ inequality (10).

The property of positive definiteness is characteristic of the correla- 
tion (matrix) function. 



Theorem 1. In order that a function $R(x_1,x_2)$, $x_i\in X$, be a correlation function it is necessary and sufficient that it be a positive definite kernel.

Proof. The necessity follows from the preceding definitions. Sufficiency follows from the fact that given a positive definite kernel $R(x_1,x_2)$ one can construct a complex Gaussian random function $\zeta(x)$ for which $R(x_1,x_2)$ is the correlation function. □

Remark. One can prove analogously that Theorem 1 is valid also for matrix correlation functions: in order that a matrix function $R(x_1,x_2)$ be a correlation function of the vector $\zeta(x)$, $x\in X$, it is necessary and sufficient that it be a positive definite matrix kernel.

Let $X$ be a metric space with metric $\rho$.

Definition 4. A Hilbert random function $\{\zeta(x),\ x\in X\}$ is called continuous at point $x_0$ in the mean square (briefly m.s. continuous) if

$$\mathsf{E}|\zeta(x)-\zeta(x_0)|^2\to0\quad\text{as } \rho(x,x_0)\to0.$$

From Lemma 3 of Section 1 in Chapter I we obtain the following

Theorem 2. For the m.s. continuity of $\zeta(x)$ at point $x_0$ it is necessary and sufficient that the covariance $B(x_1,x_2)=\mathsf{E}\zeta(x_1)\overline{\zeta(x_2)}$ be continuous at point $(x_0,x_0)$.

Remark 1. Stochastic continuity of $\zeta(x)$ at point $x_0$ follows from the m.s. continuity of $\zeta(x)$ at the same point. Indeed, in view of Chebyshev's inequality

$$\mathsf{P}\{|\zeta(x)-\zeta(x_0)|>\varepsilon\}\le\frac{\mathsf{E}|\zeta(x)-\zeta(x_0)|^2}{\varepsilon^2}.$$

Remark 2. If $\zeta(x)$ is m.s. continuous on $X$ (i.e. at each point of $X$) it does not mean that the sample functions are continuous with probability 1 on $X$. Indeed, for the Poisson process we have $\mathsf{E}|\xi(t+h)-\xi(t)|^2=\lambda h+\lambda^2h^2$, but the sample functions $\xi(t)$ are discontinuous with a positive probability.

Wide-sense stationary processes. If we assume that the random function 
C (x) possesses certain invariant properties with respect to variable x, then 
the class of corresponding correlation functions also possesses a certain 
invariance and it becomes possible to describe this class in more detail. 
In this subsection we shall consider limitations of this kind and a de- 
scription of the corresponding class of correlation functions will be given 
in the next. 

We start with an important generalization of the notion of station- 
arity of a random process. 

Let $\zeta(t)$, $t\in(-\infty,\infty)$, be a stationary process with values in $\mathscr{Z}^m$. Then the quantities

$$a(t)=\mathsf{E}\zeta(t),\qquad R(t_0+t,t_0)=\mathsf{E}(\zeta(t_0+t)-a(t_0+t))(\zeta(t_0)-a(t_0))^*$$

are independent of $t$ and $t_0$ respectively:

$$a(t)=a=\text{const},\qquad R(t_1,t_2)=R(t_1-t_2,0)=R(t_1-t_2). \tag{11}$$

Function $R(t)=R(t+t_0,t_0)$ is also called the (matrix) correlation function of a stationary process.

Clearly, even if for a certain random process the equalities (11) are satisfied, still it is not sufficient for the process to be stationary. However, in problems whose solution depends only on the values of the moments of the first two orders of the process, the stationarity condition is utilized only to the extent expressed in relations (11). Therefore it would seem natural to introduce the following important class of processes first investigated by A. Ja. Khinchin.

Definition 5. A Hilbert m.s. continuous random process $\zeta(t)$, $-\infty<t<\infty$, with values in $\mathscr{Z}^m$ is called a wide-sense stationary process (or a stationary process in Khinchin's sense) if

$$\mathsf{E}\zeta(t)=a=\text{const},\qquad\mathsf{E}(\zeta(t_1)-a)(\zeta(t_2)-a)^*=R(t_1-t_2).$$

Let $\zeta(t)$ be a one-dimensional wide-sense stationary process. Since the correlation function $R(t_1-t_2)$ is a positive definite kernel it follows that

$$\sum_{j,k=1}^nR(t_j-t_k)z_j\bar z_k\ge0$$

for any $n$, $t_j\in(-\infty,\infty)$ and complex $z_j$ $(j=1,\dots,n)$. Positive definite kernels on the linear space $\mathcal{X}$ which depend on the difference of the arguments are important in various problems of analysis. These are called positive definite functions on $\mathcal{X}$.

Definition 6. Let $\mathcal{X}$ be a linear space. A complex-valued function $f(x)$, $x\in\mathcal{X}$, is called positive definite if for any $n$, $x_j\in\mathcal{X}$ and complex numbers $z_j$ $(j=1,2,\dots,n)$

$$\sum_{j,k=1}^nf(x_j-x_k)z_j\bar z_k\ge0.$$

A positive definite function has the following properties (cf. (3)-(6)):

1) $f(0)\ge0$, (12)

2) $f(-x)=\overline{f(x)}$, (13)

3) $|f(x)|\le f(0)$, (14)

4) $|f(x_1)-f(x_2)|^2\le2f(0)\,[f(0)-\operatorname{Re}f(x_2-x_1)]$. (15)

In particular a positive definite function is bounded on $\mathcal{X}$. Next, if it is continuous at $x=0$ it is then uniformly continuous on the whole space $\mathcal{X}$.

We now return to wide-sense stationary processes with values in $\mathscr{Z}^m$. Each component of such a process is a one-dimensional wide-sense stationary process and the cross correlation function of two components $\zeta^j(t)$ and $\zeta^k(t)$ of the process depends only on the difference of the arguments:

$$\mathsf{E}(\zeta^j(t_1)-a^j)\overline{(\zeta^k(t_2)-a^k)}=R^j_k(t_1-t_2).$$

Definition 7. If $\zeta(t)$ and $\eta(t)$ are wide-sense stationary random processes and the compound process $\{\zeta(t),\eta(t)\}$ is also wide-sense stationary, then the processes $\zeta(t)$ and $\eta(t)$ are called stationary correlated (in the wide sense).

It follows from this definition that any group of components of a 
wide-sense stationary process regarded as a “self-contained” stationary 
process is stationary correlated with any other group of components of 
this process. 

Definition 8. Let $\mathcal{X}$ be a linear space. The matrix function $C(x)=\|C^j_k(x)\|$, $x\in\mathcal{X}$, $j,k=1,\dots,m$, is called positive definite if for any $n$, $x_j$, $z_j$, where $x_j\in\mathcal{X}$ and $z_j$ are complex vectors in $\mathscr{Z}^m$,

$$\sum_{j,k=1}^nz_j^*C(x_j-x_k)z_k\ge0.$$

Since $C(x_1-x_2)$ is a matrix positive definite kernel, $C(x)$ possesses the following properties (cf. (8)-(10)):

1) $C(0)$ is a positive definite matrix, (16)

2) $C^*(x)=C(-x)$, (17)

3) $|C^j_k(x)|^2\le C^j_j(0)\,C^k_k(0)$. (18)

It follows from Definition 5 that the matrix correlation function of a wide-sense stationary process is a positive definite matrix function. In particular it possesses properties (16)-(18).

The definition of a wide-sense stationary process directly carries over to the case of random sequences $\{\zeta(n),\ n=0,\pm1,\pm2,\dots\}$. In this case

$$\mathsf{E}\zeta(n)=a=\text{const},\qquad\mathsf{E}(\zeta(k+n)-a)(\zeta(k)-a)^*=R(n).$$

The matrix correlation function here is a sequence of matrices.

We now present a number of examples of correlation functions of wide-sense stationary sequences.

Example 1. A standard non-correlated sequence of random vectors $\{\xi(n),\ n=0,\pm1,\pm2,\dots\}$ is a sequence which satisfies the following conditions

$$a=\mathsf{E}\xi(n)=0,\qquad R(0)=I,\qquad R(n)=0\ \text{ for } n\ne0,$$

where $I$ is the unit matrix.

Example 2. Markov stationary Gaussian sequence. We shall confine ourselves to the case of a vector sequence with real components, zero mean vector and nondegenerate matrix $R(0)=\mathsf{E}\zeta(n)\zeta^*(n)$. It follows from the last assumption that the distribution of $\zeta(n)$ is not concentrated in a proper subspace of the space of values of $\zeta(n)$.

Since $\mathsf{E}(\zeta(n+1)-A\zeta(n))\zeta^*(n)=0$ if $A=R(1)R^{-1}(0)$, it follows from the Gaussian property of the process that $\zeta(n)$ and $\eta(n)=\zeta(n+1)-A\zeta(n)$ are independent. Therefore

$$\mathsf{E}\{\zeta(n+1)\mid\zeta(n)\}=\mathsf{E}\{A\zeta(n)+\eta(n)\mid\zeta(n)\}=A\zeta(n).$$

Let $\mathfrak{F}_n$ be the $\sigma$-algebra generated by the random variables $\zeta(s)$, $s\le n$. Since the process is Markovian, $\mathsf{E}\{\zeta(s+n)\mid\mathfrak{F}_s\}=\mathsf{E}\{\zeta(s+n)\mid\zeta(s)\}$. Therefore

$$\mathsf{E}\{\zeta(s+n)\mid\zeta(s)\}=\mathsf{E}\{\mathsf{E}\{\zeta(s+n)\mid\mathfrak{F}_{s+n-1}\}\mid\zeta(s)\}=\mathsf{E}\{\mathsf{E}\{\zeta(s+n)\mid\zeta(s+n-1)\}\mid\zeta(s)\}=$$
$$=A\,\mathsf{E}\{\zeta(s+n-1)\mid\zeta(s)\}=A^n\zeta(s).$$

Finally for $n>0$

$$R(n)=\mathsf{E}\{\zeta(s+n)\zeta^*(s)\}=\mathsf{E}\{\mathsf{E}\{\zeta(s+n)\mid\zeta(s)\}\zeta^*(s)\}=A^nR(0).$$

Thus the correlation function of a stationary Markov Gaussian sequence is of the form

$$R(n)=A^nR(0)\qquad(n\ge0). \tag{19}$$

For example in the one-dimensional case $(|a|\le1)$

$$R(n)=a^nR(0)\qquad(n\ge0). \tag{20}$$

Example 3. The process of moving averages. Let $\{\xi(n),\ n=0,\pm1,\dots\}$ be a standard non-correlated sequence of random vectors with values in $\mathscr{Z}^m$. Set

$$\zeta(n)=\sum_{k=0}^\infty A_k\xi(n-k), \tag{21}$$

where $A_k$, $k=0,1,2,\dots$, is a sequence of matrices (operators) which map $\mathscr{Z}^m$ into itself. The series in the r.h.s. of the last equality represents the sum of orthogonal vectors in $\mathscr{L}_2\{\Omega,\mathfrak{S},\mathsf{P}\}$.

For the convergence of this series it is sufficient that

$$\mathsf{E}\sum_{k=0}^\infty|A_k\xi(n-k)|^2\le\mathsf{E}\sum_{k=0}^\infty|A_k|^2|\xi(n-k)|^2=m\sum_{k=0}^\infty|A_k|^2<\infty.$$

Here $|A|$ denotes the norm of the matrix $A$,

$$|A|=\sqrt{\operatorname{Sp}(AA^*)}=\Bigl(\sum_{j,k=1}^m|a^j_k|^2\Bigr)^{1/2}.$$

For the matrix correlation function we have the expression

$$R(n)=\sum_{k=0}^\infty A_{n+k}A_k^*. \tag{22}$$

Example 4. Autoregression process. Let $\xi(n)$ be a one-dimensional standard non-correlated sequence. Consider the finite-difference equation for determining the sequence $\zeta(n)$:

$$\zeta(n)+b_1\zeta(n-1)+\dots+b_s\zeta(n-s)=a_0\xi(n)+a_1\xi(n-1)+\dots+a_s\xi(n-s). \tag{23}$$

Many applied problems lead to equations of type (23), which is called the autoregression equation. Clearly if the values $\zeta(0),\zeta(1),\dots,\zeta(s-1)$ are given, equation (23) will enable us to express successively $\zeta(s),\zeta(s+1),\dots$ in terms of the "initial values" $\zeta(0),\dots,\zeta(s-1)$ and the values $\xi(0),\xi(1),\dots$. Consider the problem of existence of the stationary solution of (23) in which $\zeta(n)$ is expressed in terms of $\xi(m)$, $m\le n$. For this purpose we shall seek a solution of equation (23) in the form of a moving averages process

$$\zeta(n)=\sum_{k=0}^\infty c_k\xi(n-k). \tag{24}$$

Equation (23) is reduced to the system

$$\begin{aligned}&c_0=a_0,\quad c_1+b_1c_0=a_1,\quad\dots,\\&c_s+b_1c_{s-1}+\dots+b_sc_0=a_s,\\&c_p+b_1c_{p-1}+\dots+b_sc_{p-s}=0\quad\text{for } p>s.\end{aligned} \tag{25}$$

We introduce the generating functions $A(z)$, $B(z)$, $C(z)$ of the sequences $\{a_n\}$, $\{b_n\}$ and $\{c_n\}$,

and



(26) 

k= 1 

Moreover if the roots of the polynomial B(z) are located outside the 
circle \z\ ^ 1, then the series is m.s. convergent. It is easy to see that this 
result remains valid also in the case when B{z) has multiple roots. We 
have thus proved the following 

Theorem 3. The autoregression equation (23) possesses a stationary solution (24), (26) provided all the roots of the polynomial $B(z)$ are located outside the circle $|z|\le1$. The correlation function $R(n)$ of this process satisfies the difference equation

$$R(n)+b_1R(n-1)+\dots+b_sR(n-s)=0,\qquad n>s,$$
$$R(n)+b_1R(n-1)+\dots+b_sR(n-s)=\bar c_0a_n+\bar c_1a_{n+1}+\dots+\bar c_{s-n}a_s\quad\text{for } 0\le n\le s.$$
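System (25) is a forward recursion for the moving-average coefficients and is easily carried out numerically. The following sketch is illustrative and rests on assumptions not taken from the text: real coefficients, a truncation of series (24) and (22) after `count` terms, and hypothetical function names.

```python
def moving_average_coefficients(a, b, count):
    """Solve (25) recursively: c_p + b_1 c_{p-1} + ... + b_s c_{p-s} = a_p,
    with a_p = 0 for p > s.  Here a = [a_0, ..., a_s] and b = [b_1, ..., b_s]."""
    s = len(b)
    c = []
    for p in range(count):
        rhs = a[p] if p < len(a) else 0.0
        acc = sum(b[j - 1] * c[p - j] for j in range(1, s + 1) if p - j >= 0)
        c.append(rhs - acc)
    return c

def correlation(c, n):
    """Truncated scalar version of (22): R(n) = sum_k c_{n+k} c_k (real c_k)."""
    n = abs(n)
    return sum(c[n + k] * c[k] for k in range(len(c) - n))

if __name__ == "__main__":
    # zeta(n) - 0.5 zeta(n-1) = xi(n): here c_k = 0.5**k
    c = moving_average_coefficients([1.0, 0.0], [-0.5], 200)
    print(c[:4], correlation(c, 0), correlation(c, 1))
```

In this first-order example the recursion gives $c_k=0.5^k$ and $R(n)=0.5^nR(0)$ with $R(0)=4/3$, in agreement with (20) and with the difference equations of Theorem 3.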

We now present several examples of correlation functions of wide-sense 
continuous-parameter stationary processes. 

One is prompted to consider, as the simplest example, the process $\zeta(t)$ such that

$$\mathsf{E}\zeta(t)=0,\qquad\mathsf{E}|\zeta(t)|^2=1,\qquad\mathsf{E}\zeta(t)\overline{\zeta(s)}=0\ \text{ for } t\ne s.$$

The correlation function of this process is discontinuous ; thus the process 
is not m.s. continuous and does not belong to the class of processes 
studied in this section. It can be shown that this process is not equivalent 
(stochastically) to a process with measurable sample functions. On the 
other hand, some processes similar to this example and having even 
more irregular behavior are studied in the theory of generalized random 
processes. 

Example 5. Random oscillations. Oscillation processes occur in many physical and technical problems; these are represented in complex form by a function of the type

$$\zeta(t)=\sum_k\gamma_ke^{iu_kt}. \tag{27}$$

Each component of this sum represents a simple harmonic (periodic) oscillation with frequency $u_k/2\pi$ and power $|\gamma_k|^2$. The totality of the quantities $\{u_k\}$ is called the spectrum (or the frequency spectrum) of the process $\zeta(t)$. Assume that $\gamma_k$ are mutually orthogonal random variables:

$$\mathsf{E}\gamma_k=0,\qquad\mathsf{E}|\gamma_k|^2=c_k^2,\qquad\mathsf{E}\gamma_k\bar\gamma_j=0\ \text{ for } k\ne j.$$

Then the correlation function of the process $\zeta(t)$ is equal to

$$R(t_1,t_2)=\mathsf{E}\zeta(t_1)\overline{\zeta(t_2)}=\mathsf{E}\sum_{k,j}\gamma_k\bar\gamma_je^{i(u_kt_1-u_jt_2)}=\sum_kc_k^2e^{iu_k(t_1-t_2)},$$

i.e. $\zeta(t)$ is a wide-sense stationary process. Its correlation function is completely determined by the frequency spectrum and the averages (the mathematical expectations) $c_k^2$ of the measures of power corresponding to each one of the simple harmonic oscillations appearing in the process $\zeta(t)$. In connection with this power representation we introduce the important characteristic of a stationary process called the spectral function of a process.

The spectral function $F(u)$ of process (27) is determined by the relation

$$F(u)=\sum_{k:\,u_k<u}c_k^2.$$

This means that $F(u)$ is equal to the average power carried by the harmonic components of the process $\zeta(t)$ with frequencies smaller than the given value $u$. The function $F(u)$ completely characterizes the mean power of each harmonic component of the process $\zeta(t)$ and the total average power of the harmonic components of the process with the frequencies lying within any given interval. Indeed,

$$c_k^2=F(u_k+0)-F(u_k),\qquad\sum_{u_1\le u_k<u_2}c_k^2=F(u_2)-F(u_1).$$

In terms of the spectral function the correlation function of the process $\zeta(t)$ can be written in the form

$$R(t)=\int_{-\infty}^\infty e^{iut}\,dF(u). \tag{28}$$

From the mathematical point of view the spectral function is a non-negative non-decreasing function continuous from the left, which is constant everywhere except at a finite number of points where it has jumps of size $c_k^2$. It turns out that the notion of a spectral function can be introduced for arbitrary wide-sense stationary processes. This problem as well as the problem of generalization of representation (28) for arbitrary random processes is considered in the following sections.
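For a finite spectrum both the spectral function and the correlation function of Example 5 are finite sums. The sketch below is an illustration with made-up frequencies and powers (the lists `freqs` and `powers` are assumptions, as are the function names); it uses the left-continuous convention $u_k<u$ of the text.

```python
import cmath

def spectral_function(freqs, powers, u):
    """F(u): total mean power c_k^2 of the harmonics with frequency u_k < u."""
    return sum(c2 for uk, c2 in zip(freqs, powers) if uk < u)

def correlation_function(freqs, powers, t):
    """R(t) = sum_k c_k^2 exp(i u_k t), i.e. representation (28) for a
    spectral function with finitely many jumps."""
    return sum(c2 * cmath.exp(1j * uk * t) for uk, c2 in zip(freqs, powers))

if __name__ == "__main__":
    freqs = [-1.0, 0.5, 2.0]    # hypothetical frequencies u_k
    powers = [0.3, 0.2, 0.5]    # hypothetical mean powers c_k^2
    print(spectral_function(freqs, powers, 0.5))   # jump at 0.5 not yet counted
    print(correlation_function(freqs, powers, 1.0))
```

Note that $R(0)$ equals the total mean power $F(+\infty)$ and that $R(-t)=\overline{R(t)}$, as required by property (13) of positive definite functions.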
complex-valued random variables $\{\zeta(n),\ n=\dots,-1,0,1,\dots\}$ for which

$$\mathsf{E}\zeta(n)=0,\qquad\mathsf{E}\zeta(k+n)\overline{\zeta(k)}=R(n).$$

The sequence of numbers $R(n)$ is positive definite, i.e. for any $n$ and any complex numbers $z_j$

$$\sum_{j,k=0}^nR(j-k)z_j\bar z_k\ge0.$$

Theorem 1. The function $\{R(n),\ n=0,\pm1,\pm2,\dots\}$ is a correlation function of a wide-sense stationary sequence of random variables iff it can be represented in the form

$$R(n)=\int_{-\pi}^{\pi}e^{inu}F(du), \tag{1}$$

where $F(\cdot)$ is a finite measure on $[-\pi,\pi]$. The measure $F$ is uniquely defined on the Borel sets of the interval $[-\pi,\pi]$.

Proof. Sufficiency. Sequence (1) is positive definite since

$$\sum_{j,k=0}^nR(j-k)z_j\bar z_k=\int_{-\pi}^{\pi}\Bigl|\sum_{j=0}^nz_je^{iju}\Bigr|^2F(du)\ge0.$$

Therefore the sequence is a correlation function of a certain wide-sense stationary sequence.

Necessity. Let $R(n)$ be a correlation function of a certain wide-sense stationary sequence. Put

$$f(u,\rho)=\sum_{n=0}^\infty\sum_{m=0}^\infty e^{-i(n-m)u}R(n-m)\rho^{n+m},\qquad0<\rho<1. \tag{2}$$

The series in the r.h.s. of (2) is absolutely convergent since

$$\sum_{n=0}^N\sum_{m=0}^N\bigl|e^{-i(n-m)u}R(n-m)\rho^{n+m}\bigr|\le R(0)\Bigl(\sum_{n=0}^N\rho^n\Bigr)^2\le\frac{R(0)}{(1-\rho)^2}.$$

It follows from the positive definiteness of $R(n)$ that $f(u,\rho)\ge0$. Changing the order of summation in (2) we obtain

$$f(u,\rho)=\sum_{k=-\infty}^\infty\sum_{j=0}^\infty e^{-iku}R(k)\rho^{|k|+2j}=\sum_{k=-\infty}^\infty e^{-iku}\,\frac{R(k)\rho^{|k|}}{1-\rho^2}.$$

The relation obtained shows that the quantities $\dfrac{R(k)\rho^{|k|}}{1-\rho^2}$ are the Fourier coefficients of a positive function $f(u,\rho)$. Hence

$$\frac{\rho^{|n|}R(n)}{1-\rho^2}=\frac{1}{2\pi}\int_{-\pi}^{\pi}e^{inu}f(u,\rho)\,du$$

or

$$\rho^{|n|}R(n)=\int_{-\pi}^{\pi}e^{inu}F_\rho(du), \tag{3}$$

where

$$F_\rho(A)=\frac{1-\rho^2}{2\pi}\int_Af(u,\rho)\,du$$

and $F_\rho[-\pi,\pi]=R(0)<\infty$. The family of measures $F_\rho(\cdot)$ on $[-\pi,\pi]$ is weakly compact. Therefore a sequence $\rho_k\uparrow1$ can be found such that $F_{\rho_k}(\cdot)$ converges weakly to a certain measure $F(\cdot)$. Approaching the limit in (3) as $\rho=\rho_k\to1$ we obtain formula (1).

We now prove the uniqueness of measure $F$. Assume that there exist
measures $F_1$ and $F_2$ defined on the Borel sets of the interval $[-\pi, \pi]$
such that $R(n)$ can be represented by (1). Denote by $K$ the class of Borel
functions $f(u)$ on $[-\pi, \pi]$ for which

$$\int_{-\pi}^{\pi} f(u)\, F_1(du) = \int_{-\pi}^{\pi} f(u)\, F_2(du).$$

The class $K$ is linear and closed with respect to the operation of uniform
limits and limits of bounded monotone sequences of functions. Since this
class contains functions of the type $e^{inu}$ ($n = 0, \pm 1, \ldots$) it contains,
in view of Weierstrass' approximation theorem for continuous functions, all
continuous functions and hence all the bounded Borel functions. Putting
$f(u) = \chi_A(u)$, where $A$ is an arbitrary Borel set on $[-\pi, \pi]$, we obtain
that $F_1(A) = F_2(A)$. The theorem is thus proved. □
The measure $F$ is called the spectral measure of a stationary sequence
and the corresponding distribution function $F(u) = F(-\infty, u)$ is called
the spectral function. If $F(du) = f(u)\, du$, i.e. if measure $F$ is absolutely
continuous with respect to the Lebesgue measure, then $f(u)$ is called the
spectral density of the sequence $\zeta(n)$. We note that condition

$$\sum_{n=-\infty}^{\infty} |R(n)| < \infty$$

assures the existence of a spectral density. Indeed in this case the Fourier
series

$$2\pi f(u) = \sum_{n=-\infty}^{\infty} R(n) e^{-inu} \tag{4}$$

converges uniformly and absolutely. Therefore

$$R(n) = \int_{-\pi}^{\pi} e^{inu} f(u)\, du.$$
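As a numerical illustration of the summability criterion, the following sketch
takes the hypothetical correlation function $R(n) = a^{|n|}$ with $a = 0.5$ (an
assumption chosen only because the series (4) then has a closed form, the Poisson
kernel), evaluates the truncated series (4), and checks that integrating the
resulting density against $e^{inu}$ returns $R(n)$. The function names and the
truncation and grid sizes are illustrative choices, not part of the text.

```python
import cmath
import math

def spectral_density(R, u, n_max=60):
    """Truncated series (4): f(u) = (1/(2*pi)) * sum_{|n| <= n_max} R(n) * exp(-i*n*u)."""
    total = sum(R(n) * cmath.exp(-1j * n * u) for n in range(-n_max, n_max + 1))
    return total.real / (2.0 * math.pi)

def correlation_from_density(f, n, grid=2000):
    """Midpoint-rule approximation of R(n) = int_{-pi}^{pi} exp(i*n*u) f(u) du."""
    h = 2.0 * math.pi / grid
    acc = 0j
    for j in range(grid):
        u = -math.pi + (j + 0.5) * h
        acc += cmath.exp(1j * n * u) * f(u)
    return acc * h

a = 0.5  # hypothetical parameter, |a| < 1, so that sum |R(n)| is finite

def R(n):
    return a ** abs(n)

def f(u):
    return spectral_density(R, u)

def f_closed(u):
    # Closed form of (4) for R(n) = a^{|n|} (Poisson kernel divided by 2*pi).
    return (1.0 - a * a) / (2.0 * math.pi * (1.0 - 2.0 * a * math.cos(u) + a * a))

if __name__ == "__main__":
    print(f(0.7), f_closed(0.7))
    print(correlation_from_density(f, 3).real, R(3))
```

The density obtained this way is non-negative, in agreement with Theorem 1.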

Homogeneous random fields. We shall generalize Theorem 1 to the case
of continuous parameter wide-sense stationary fields.

Definition 1. A random function $\{\zeta(x),\ x \in \mathcal{R}^m\}$ is called a
homogeneous field in $\mathcal{R}^m$ if

$$\mathsf{E}\zeta(x) = a = \text{const},$$

$$R(x_1, x_2) = \mathsf{E}[\zeta(x_1) - a]\,\overline{[\zeta(x_2) - a]} = R(x_1 - x_2). \tag{5}$$

Thus a correlation function of a homogeneous random field $R(x_1, x_2)$
depends only on the vector which joins the points $x_1$ and $x_2$. The function
$R(x)$ in the r.h.s. of equality (5) is also called a correlation function of a
homogeneous field. The condition of positive definiteness of a correlation
function is of the form

$$\sum_{j,k=1}^{n} R(x_j - x_k)\, z_j \overline{z_k} \ge 0.$$

It follows from relation

$$\mathsf{E}|\zeta(x+h) - \zeta(x)|^2 = 2\,[R(0) - \operatorname{Re} R(h)]$$

that if function $R(x)$ is continuous at $x = 0$ then the field $\zeta(x)$ is m.s.
continuous at each point $x \in \mathcal{R}^m$.

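Theorem 2. In order that function $R(x)$ ($x \in \mathcal{R}^m$) be the correlation
function of a homogeneous m.s. continuous random field $\{\zeta(x),\ x \in \mathcal{R}^m\}$
it is necessary and sufficient that it admit representation

$$R(x) = \int_{\mathcal{R}^m} e^{i(x,u)}\, F(du), \tag{6}$$

where $F$ is a finite measure on the Borel sets of $\mathcal{R}^m$. Moreover the
measure $F$ is uniquely determined on $\mathfrak{B}^m$.

Sufficiency. The function $R(x)$ determined by formula (6) is continuous
and positive-definite:

$$\sum_{j,k=1}^{n} R(x_j - x_k)\, z_j \overline{z_k}
  = \int_{\mathcal{R}^m} \sum_{j,k=1}^{n} e^{i(x_j - x_k,\, u)} z_j \overline{z_k}\, F(du)
  = \int_{\mathcal{R}^m} \Bigl| \sum_{j=1}^{n} e^{i(x_j, u)} z_j \Bigr|^2 F(du) \ge 0.$$

Therefore $R(x)$ is the correlation function of a certain m.s. continuous
complex Gaussian field (cf. Section 1 of Chapter III). Moreover one can
construct a very simple example of a homogeneous field with the correlation
function given by (6). To do this we introduce a random vector $\xi$ in
$\mathcal{R}^m$ with the distribution

$$\mathsf{P}\{\xi \in A\} = \frac{1}{F_0} F(A), \qquad F_0 = F(\mathcal{R}^m),$$

for an arbitrary Borel set $A \subset \mathcal{R}^m$. Set
$\zeta(x) = \sqrt{F_0}\, e^{i[(x, \xi) + \varphi]}$, where $\varphi$ is a
random variable uniformly distributed on $(-\pi, \pi)$, and $\varphi$ and $\xi$ are
mutually independent. Then

$$\mathsf{E}\zeta(x) = 0,$$

$$R(x, y) = \mathsf{E}\zeta(x)\overline{\zeta(y)} = F_0\, \mathsf{E} e^{i(x-y,\, \xi)}
  = \int_{\mathcal{R}^m} e^{i(x-y,\, u)}\, F(du).$$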
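The randomized-harmonic construction just described is easy to simulate. The
sketch below is only an illustration under stated assumptions: the spectral
measure is taken to be a hypothetical discrete measure on $\mathcal{R}^2$ with
three atoms, and all names (`atoms`, `sample_field`, `estimate`) are invented
for the example. It draws the frequency vector with probabilities proportional
to the atom masses, draws an independent uniform phase, and compares the Monte
Carlo estimate of the correlation with the integral in (6), which is a finite
sum here.

```python
import cmath
import math
import random

# Hypothetical discrete spectral measure on R^2: (atom location, mass).
atoms = [((1.0, 0.0), 1.0), ((0.0, 2.0), 1.5), ((-1.0, 1.0), 0.5)]
F0 = sum(w for _, w in atoms)  # total mass F(R^m)

def exact_R(x, y):
    """R(x - y) = sum over atoms of mass * exp(i (x - y, u))."""
    return sum(w * cmath.exp(1j * ((x[0] - y[0]) * u[0] + (x[1] - y[1]) * u[1]))
               for u, w in atoms)

def sample_field(rng):
    """One realization zeta(x) = sqrt(F0) * exp(i[(x, xi) + phi])."""
    xi = rng.choices([u for u, _ in atoms], weights=[w for _, w in atoms])[0]
    phi = rng.uniform(-math.pi, math.pi)
    return lambda x: math.sqrt(F0) * cmath.exp(1j * (x[0] * xi[0] + x[1] * xi[1] + phi))

def estimate(x, y, n_samples=20000, seed=0):
    """Monte Carlo estimates of E zeta(x) and E zeta(x) conj(zeta(y))."""
    rng = random.Random(seed)
    mean_x, corr = 0j, 0j
    for _ in range(n_samples):
        zeta = sample_field(rng)
        zx, zy = zeta(x), zeta(y)
        mean_x += zx
        corr += zx * zy.conjugate()
    return mean_x / n_samples, corr / n_samples

if __name__ == "__main__":
    x, y = (0.3, -0.2), (1.0, 0.5)
    print(estimate(x, y), exact_R(x, y))
```

The phase cancels in the product $\zeta(x)\overline{\zeta(y)}$; its only role is
to make the mean value vanish.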









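Necessity. We show now that an arbitrary continuous positive definite
function admits representation (6). It follows from the condition of
positive definiteness that for an arbitrary function $g(x)$ integrable in
$\mathcal{R}^m$ the inequality

$$\int_{\mathcal{R}^m} \int_{\mathcal{R}^m} R(x-y)\, g(x)\, \overline{g(y)}\, dx\, dy \ge 0$$

is valid. We set $g(x) = \exp\{-|x|^2/(2N) - i(x, z)\}$, where $N > 0$ and
$z \in \mathcal{R}^m$. Then

$$\int_{\mathcal{R}^m} \int_{\mathcal{R}^m} R(x-y)
  \exp\Bigl\{ -\frac{|x|^2 + |y|^2}{2N} - i(x-y,\, z) \Bigr\}\, dx\, dy \ge 0.$$

Carrying out the following orthogonal transformation of the coordinates
in $\mathcal{R}^{2m}$

$$x - y = \sqrt{2}\, u, \qquad x + y = \sqrt{2}\, v,$$

under which $|x|^2 + |y|^2 = |u|^2 + |v|^2$, we obtain

$$0 \le \int \int R(\sqrt{2}\,u)
  \exp\Bigl\{ -\frac{|u|^2 + |v|^2}{2N} - i(\sqrt{2}\,u,\, z) \Bigr\}\, du\, dv
  = (2\pi N)^{m/2} \int R(\sqrt{2}\,u)
  \exp\Bigl\{ -\frac{|u|^2}{2N} - i(\sqrt{2}\,u,\, z) \Bigr\}\, du.$$

Replacing $\sqrt{2}\,u$ by $u$ and noting that $N > 0$ is arbitrary, we see that
the function

$$R_N(z) = \int_{\mathcal{R}^m} R(u) \exp\Bigl\{ -\frac{|u|^2}{2N} - i(u, z) \Bigr\}\, du$$

is non-negative. Moreover this function is the Fourier transform of an
integrable continuous function $R(u) e^{-|u|^2/2N}$ and is also differentiable.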

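We show that $R_N(z)$ is integrable. Since $R_N(z)$ and
$(2\pi)^{m/2} e^{-\varepsilon |z|^2/2}$, $\varepsilon > 0$, are the
Fourier transforms of the functions

$$R(u)\, e^{-|u|^2/2N}, \qquad \varepsilon^{-m/2} e^{-|u|^2/2\varepsilon}$$

correspondingly, it follows from Parseval's equality that

$$\frac{1}{(2\pi)^{m/2}} \int_{\mathcal{R}^m} R_N(z)\, e^{-\varepsilon |z|^2/2}\, dz
  = \int_{\mathcal{R}^m} R(u)\, e^{-|u|^2/2N}\, \varepsilon^{-m/2} e^{-|u|^2/2\varepsilon}\, du
  \le R(0)\, \frac{1}{\varepsilon^{m/2}} \int_{\mathcal{R}^m} e^{-|u|^2/2\varepsilon}\, du
  = (2\pi)^{m/2} R(0).$$

Let $\varepsilon \to 0$. Utilizing Fatou's lemma we obtain

$$\int_{\mathcal{R}^m} R_N(z)\, dz \le (2\pi)^m R(0).$$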



From the integrability of $R_N(z)$ it follows that the inversion formula for
Fourier transforms is applicable to this function:

$$R(u)\, e^{-|u|^2/2N} = \frac{1}{(2\pi)^m} \int_{\mathcal{R}^m} e^{i(u, z)} R_N(z)\, dz
  = \int_{\mathcal{R}^m} e^{i(u, z)}\, F_N(dz),$$

where

$$F_N(A) = \frac{1}{(2\pi)^m} \int_A R_N(z)\, dz. \tag{7}$$

Thus the function $\dfrac{R(u)}{R(0)}\, e^{-|u|^2/2N}$ is the characteristic
function of a certain distribution in $\mathcal{R}^m$ and converges as
$N \to \infty$ to a continuous function.
Therefore (Chapter I, Section 1, Theorem 3) $R(u)/R(0)$ is also a
characteristic function. The uniqueness of measure $F$ in representation (6)
follows from the theorem on the uniqueness of a distribution with a
given characteristic function (Chapter I, Section 1, Theorem 2). The
theorem is thus proved. □

As in the case of sequences, measure $F(\cdot)$ in representation (6) is
called the spectral measure, and the corresponding distribution function
$F(u) = F(I_u)$, where $I_u = \{x : x^k < u^k,\ k = 1, \ldots, m\}$, is called
the spectral function.

If the spectral measure is absolutely continuous:

$$F(A) = \int_A f(u)\, du,$$

then $f(u)$ is called the spectral density of a random field. If the spectral
density exists, then the spectral representation of a correlation function
becomes

$$R(x) = \int_{\mathcal{R}^m} e^{i(x, u)} f(u)\, du.$$

We note the following criterion for the existence of the spectral
density: if $R(x)$ is an absolutely integrable function ($x \in \mathcal{R}^m$),
then the spectral density exists.

To verify this criterion, we utilize the notation and relations obtained
in the course of the proof of the preceding theorem. By virtue of Parseval's
equality for Fourier integrals we have

$$\int_K R_N(z)\, dz = \int_{\mathcal{R}^m} R_N(z)\, \chi_K(z)\, dz
  = \int_{\mathcal{R}^m} R(u)\, e^{-|u|^2/2N}
    \prod_{k=1}^{m} \frac{e^{-i(x^k - h^k)u^k} - e^{-i(x^k + h^k)u^k}}{i u^k}\, du,$$

where $K = \{z : x^k - h^k < z^k < x^k + h^k,\ k = 1, \ldots, m\}$. Each factor
of the product does not exceed $2h^k$ in absolute value. Hence

$$F(K) \le \frac{V(K)}{(2\pi)^m} \int_{\mathcal{R}^m} |R(u)|\, du,$$

where $V(K)$ is the volume of the parallelepiped $K$. Hence measure $F(\cdot)$
is absolutely continuous with respect to Lebesgue's measure. □

Corollary 1. A function $R(t)$, $t \in (-\infty, \infty)$, is the correlation
function of a wide-sense stationary m.s. continuous process if and only if

$$R(t) = \int_{-\infty}^{\infty} e^{itu}\, F(du),$$

where $F(\cdot)$ is a finite measure on $\mathfrak{B}^1$.

Corollary 2. A function $f(u)$, $u \in \mathcal{R}^m$, $f(0) = 1$, is the
characteristic function of a distribution in $\mathcal{R}^m$ if and only if it is
continuous and positive-definite.

Homogeneous and isotropic fields. Formula (6) can be further specialized
if certain additional assumptions are imposed on the random field. An
important and at the same time quite general property is the isotropy of
a random field. A random field is called isotropic if its correlation function
$R(x_1, x_2)$ depends only on $x_2$ and on the distance between the points
$x_1$ and $x_2$. If the field is, in addition, homogeneous then

$$R(x_1, x_2) = R(\rho),$$

where $\rho$ is the distance between $x_1$ and $x_2$,
$\rho = \sqrt{\sum_{j=1}^{m} (x_1^j - x_2^j)^2}$.

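We derive a representation of the correlation function of an m.s.
continuous homogeneous and isotropic random field. Since the field
is homogeneous its correlation function is of form (6). Integrating both
sides of this formula over the surface of the sphere $S_\rho$ of radius $\rho$
and dividing by its surface area $s(S_\rho) = 2\pi^{m/2} \rho^{m-1} / \Gamma(m/2)$
we obtain

$$R(\rho) = \int_{\mathcal{R}^m} \Bigl\{ \frac{1}{s(S_\rho)}
  \int_{S_\rho} e^{i(x, u)}\, s(dx) \Bigr\}\, F(du),$$

where $s(dx)$ in the inner integral denotes the integration over the surface
of the sphere $S_\rho$. Note that if $V_\rho$ denotes a sphere of radius $\rho$ with
the center at the origin, then

$$\int_{S_\rho} f(x)\, s(dx) = \frac{d}{d\rho} \int_{V_\rho} f(x)\, dx.$$

On the other hand

$$\int_{V_\rho} e^{i(x, u)}\, dx = \Bigl( \frac{2\pi\rho}{|u|} \Bigr)^{m/2} J_{m/2}(\rho |u|),$$

where $J_\nu(x)$ is the Bessel function of the first kind. It follows from here
that

$$\int_{S_\rho} e^{i(x, u)}\, s(dx)
  = (2\pi)^{m/2} \rho^{m-1}\, \frac{J_{(m-2)/2}(\rho |u|)}{(\rho |u|)^{(m-2)/2}}. \tag{8}$$

We introduce a measure $g$ on the semi-axis $[0, \infty)$ by putting
$g([a, b)) = F(V_b \setminus V_a)$, $0 \le a \le b$, where $V_\rho$ denotes the
open sphere of radius $\rho$. Then

$$R(\rho) = 2^{(m-2)/2}\, \Gamma\Bigl(\frac{m}{2}\Bigr)
  \int_0^{\infty} \frac{J_{(m-2)/2}(\lambda\rho)}{(\lambda\rho)^{(m-2)/2}}\, g(d\lambda) \tag{9}$$

and $g([0, \infty)) = F(\mathcal{R}^m) = R(0)$.

We have thus proved the following theorem.

Theorem 3. In order that $R(\rho)$ be a correlation function of a homogeneous
and isotropic m.s. continuous m-dimensional random field, it is necessary
and sufficient that this function admit representation (9), where $g$ is a finite
measure on $[0, \infty)$.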

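For $m = 2$ formula (9) becomes

$$R(\rho) = \int_0^{\infty} J_0(\lambda\rho)\, g(d\lambda), \tag{10}$$

and for $m = 3$

$$R(\rho) = \int_0^{\infty} \frac{\sin \lambda\rho}{\lambda\rho}\, g(d\lambda). \tag{11}$$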
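The kernel appearing under the integral sign in (9) can be evaluated from its
power series, which gives a quick way to check the special cases. The sketch
below is an illustration only: the function name `omega` is an assumption, and
the kernel is written as $\Omega_m(x) = \Gamma(m/2)\,(2/x)^{(m-2)/2} J_{(m-2)/2}(x)$,
the normalization with $\Omega_m(0) = 1$. It confirms numerically that the
kernel reduces to $\sin x / x$ for $m = 3$ and to $\cos x$ for $m = 1$, and that
$\Omega_m(x\sqrt{m})$ is close to $e^{-x^2/2}$ for large $m$.

```python
import math

def omega(m, x, terms=60):
    """Power series of Omega_m(x) = Gamma(m/2) * (2/x)^((m-2)/2) * J_{(m-2)/2}(x).

    The k-th term is (-1)^k (x/2)^(2k) * Gamma(m/2) / (k! * Gamma(m/2 + k)),
    so consecutive terms differ by the factor -(x/2)^2 / ((k+1) * (m/2 + k)).
    """
    half = m / 2.0
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -(x / 2.0) ** 2 / ((k + 1) * (half + k))
    return total

if __name__ == "__main__":
    x = 2.0
    print(omega(3, x), math.sin(x) / x)                        # m = 3 kernel
    print(omega(1, x), math.cos(x))                            # m = 1 kernel
    print(omega(10000, 1.0 * math.sqrt(10000)), math.exp(-0.5))  # large-m behaviour
```

The series is used here only for moderate arguments; for large $x$ a library
Bessel routine would be the more robust choice.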



The same argument shows that the m.s. continuous random field
$\xi(t, x)$, $-\infty < t < \infty$, $x \in \mathcal{R}^m$, will be homogeneous in
variables $(t, x)$ and isotropic in "the spatial" variables $x$, i.e. its
correlation function will depend only on $t$ and $\rho$:

$$\mathsf{E}\xi(t+s, x)\overline{\xi(s, y)} = R(t, \rho),$$

where $\rho$ is the distance between $x$ and $y$, if and only if this correlation
function is of the form

$$R(t, \rho) = \int_{-\infty}^{\infty} \int_0^{\infty}
  e^{it\nu}\, \Omega_m(\lambda\rho)\, g(d\nu \times d\lambda), \tag{12}$$

where

$$\Omega_m(x) = 2^{(m-2)/2}\, \Gamma\Bigl(\frac{m}{2}\Bigr)\,
  \frac{J_{(m-2)/2}(x)}{x^{(m-2)/2}} \tag{13}$$

and $g$ is a finite measure on the half-plane $(\lambda, \nu)$,
$\lambda \in [0, \infty)$, $\nu \in (-\infty, \infty)$.

We now obtain the general form of the correlation function of an
m.s. continuous homogeneous isotropic field in a Hilbert space. If $R(\rho)$
is such a correlation function then for any $m$ the function $R(\rho)$,
$\rho^2 = \sum_{k=1}^{m} (x^k)^2$, is the correlation function of an m.s.
continuous homogeneous field in $\mathcal{R}^m$. We note that the function
$e^{-\lambda^2 \rho^2/2}$ possesses this property for any $\lambda \ge 0$.
Indeed, for any $m$ and $\lambda > 0$

$$\exp\Bigl\{ -\frac{\lambda^2}{2} \sum_{k=1}^{m} (x^k)^2 \Bigr\}
  = \frac{1}{(2\pi\lambda^2)^{m/2}} \int_{\mathcal{R}^m}
    \exp\Bigl\{ i(x, u) - \frac{|u|^2}{2\lambda^2} \Bigr\}\, du,$$

i.e. the function $e^{-\lambda^2 \rho^2/2}$, $\rho^2 = \sum_{k=1}^{m} (x^k)^2$,
is the Fourier transform of a positive function and therefore is
positive-definite. From here it follows that the function

$$R(\rho) = \int_0^{\infty} e^{-\lambda^2 \rho^2/2}\, g(d\lambda) \tag{14}$$

is also positive-definite for any finite measure $g$ on $[0, \infty)$ and for any
$m$ if $\rho^2 = \sum_{k=1}^{m} (x^k)^2$. We show that formula (14) exhausts all
possible positive definite continuous functions in a Hilbert space which depend
only on $\rho$.

Theorem 4. In order that function $R(\rho)$ be the correlation function of an
m.s. continuous homogeneous and isotropic random field in a Hilbert
space it is necessary and sufficient that it be of the form (14).

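The sufficiency follows from the arguments above. To prove necessity
we note that, in view of Theorem 3 and the discussion following this
theorem, we have for each $m$

$$R(\rho) = \int_0^{\infty} \Omega_m(\lambda\rho)\, g_m(d\lambda),
  \qquad g_m[0, \infty) = R(0),$$

$$\Omega_m(x) = \Gamma\Bigl(\frac{m}{2}\Bigr) \Bigl(\frac{2}{x}\Bigr)^{(m-2)/2} J_{(m-2)/2}(x)
  = 1 - \frac{x^2}{2m} + \frac{x^4}{2 \cdot 4 \cdot m(m+2)}
    - \frac{x^6}{2 \cdot 4 \cdot 6 \cdot m(m+2)(m+4)} + \cdots.$$

Moreover $\Omega_m(x\sqrt{m}) \to e^{-x^2/2}$ for $m \to \infty$ uniformly on each
finite interval $|x| \le N$. Therefore it is sufficient to prove the uniform
boundedness of the family of functions $\Omega_m(x)$ for $x \in [0, \infty)$ and
the weak compactness of the family of distributions $g_m(\sqrt{m}\,u)$. With this
in mind we observe that (8) yields the following equation

$$\Omega_{m+2}(\rho) = \frac{1}{V(S_\rho)} \int_{S_\rho} e^{i(u, z)}\, s(du),$$

where $S_\rho$ is the sphere $|u| = \rho$ in the space $\mathcal{R}^{m+2}$,
$V(S_\rho)$ is its surface area and $|z| = 1$. Therefore

$$|\Omega_{m+2}(\rho)| \le 1.$$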

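To prove the weak compactness of the sequence of distribution functions
$\tilde{g}_m(u) = g_m(\sqrt{m}\,u)$ we multiply the relationship

$$R(0) - R(\rho) = \int_0^{\infty} \bigl(1 - \Omega_m(\rho u \sqrt{m})\bigr)\, \tilde{g}_m(du)
  \ge \int_{2/a}^{\infty} \bigl(1 - \Omega_m(\rho u \sqrt{m})\bigr)\, \tilde{g}_m(du)$$

by $\rho$ and integrate it from 0 to $a$. We thus obtain

$$\int_0^a [R(0) - R(\rho)]\, \rho\, d\rho
  \ge \int_{2/a}^{\infty} \Bigl\{ \int_0^a \bigl(1 - \Omega_m(\rho u \sqrt{m})\bigr)\,
    \rho\, d\rho \Bigr\}\, \tilde{g}_m(du).$$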

It follows from the formula 
2 

that for m^3 and u^- 
a 

a 

2 '* 2 ^fYi 2 ^ 

-2 ^m{QU^)gdg=~^^~\\-Q„-2{au^)'\^^, 

a J a^u^m 

0 

hence 

^ [R (0) - R (e)] 0 ^ ^ ? °°)) ■ 

0 

The compactness of the measures follows from the fact that the left- 
hand side of the last inequality tends to zero as a->oo (Chapter I, Sec- 
tion 1, Theorem 1). The theorem is thus proved. □ 

Vector-valued homogeneous fields. Let $\{\zeta(x),\ x \in \mathcal{R}^m\}$ be a
vector-valued random field with values in $\mathcal{Z}^s$. This field is called
homogeneous if $\mathsf{E}\zeta(x) = a = \text{const}$ (in the sequel we shall
assume $a = 0$) and if

$$R(x_1, x_2) = \mathsf{E}\zeta(x_1)\zeta(x_2)^* = R(x_1 - x_2).$$

A matrix countably-additive set function $F(A) = \{F_j^k(A)\}$, $k, j = 1, \ldots, s$,
is called positive definite if the matrix $F(A)$ is positive definite for
any $A \in \mathfrak{B}^m$, i.e. if for any $c \in \mathcal{Z}^s$ the set function
$F_c(A) = c^* F(A)\, c$ is a finite measure on $\mathfrak{B}^m$. Applying
Theorem 2 we obtain the following result.

Theorem 5. In order that function $R(x)$ be the matrix correlation function
of an m.s. continuous homogeneous vector field $\zeta(x)$ it is necessary and
sufficient that

$$R(x) = \int_{\mathcal{R}^m} e^{i(x, u)}\, F(du), \tag{15}$$

where $F$ is a positive definite matrix countably additive set function on
$\{\mathcal{R}^m, \mathfrak{B}^m\}$.
Proof. Let $R(x)$ be the matrix correlation function of an m.s. continuous
homogeneous field $\zeta(x)$. For any $c \in \mathcal{Z}^s$ we introduce the scalar
field $\zeta_c(x) = (\zeta(x), c)$. This field is clearly m.s. continuous and
homogeneous,

$$\mathsf{E}\zeta_c(x) = 0, \qquad
  R_c(x) = \mathsf{E}\zeta_c(x + x_0)\overline{\zeta_c(x_0)} = c^* R(x)\, c. \tag{16}$$

In view of Theorem 2 the correlation function $R_c(x)$ can be represented
in the form

$$R_c(x) = \int_{\mathcal{R}^m} e^{i(x, u)}\, F_c(du), \tag{17}$$

where $F_c$ is a finite measure on $\{\mathcal{R}^m, \mathfrak{B}^m\}$. Let $e_k$
be the vector with components $e_k^j = \delta_k^j$, and let $R(x) = \{R_j^k(x)\}$,
$k, j = 1, \ldots, s$. Then $R_{e_k}(x) = R_k^k(x)$. Set $e_{kj} = e_k + e_j$,
$\tilde{e}_{kj} = e_k + i e_j$. It is easy to obtain that

$$2R_j^k(x) = [R_{e_{kj}}(x) - R_{e_k}(x) - R_{e_j}(x)]
  - i\,[R_{\tilde{e}_{kj}}(x) - R_{e_k}(x) - R_{e_j}(x)].$$

If we put

$$F_k^k(A) = F_{e_k}(A), \qquad
  F_j^k(A) = \tfrac{1}{2}[F_{e_{kj}}(A) - F_{e_k}(A) - F_{e_j}(A)]
  - \tfrac{i}{2}[F_{\tilde{e}_{kj}}(A) - F_{e_k}(A) - F_{e_j}(A)],$$

then it will follow from (16) that

$$R_j^k(x) = \int_{\mathcal{R}^m} e^{i(x, u)}\, F_j^k(du),$$

and moreover $F_j^k(A)$ are countably additive (complex-valued) finite set
functions on $\{\mathcal{R}^m, \mathfrak{B}^m\}$. In view of the uniqueness of
representation (17), $c^* F(A)\, c = F_c(A)$, which implies that the matrix
$F(A) = \{F_j^k(A)\}$, $k, j = 1, \ldots, s$, is positive definite. The necessity
is thus proved.

To prove the sufficiency one must show that the function $R(x)$
defined by formula (15), where $F$ satisfies the conditions of the theorem,
is a continuous positive-definite matrix function. Its continuity is obvious.
Moreover, for any $z_p \in \mathcal{Z}^s$

$$\sum_{p,q=1}^{n} z_p^* R(x_p - x_q)\, z_q = \int_{\mathcal{R}^m} w^* F(du)\, w \ge 0,$$

where $w = \sum_{p=1}^{n} e^{-i(x_p, u)} z_p$. The theorem is thus proved. □

Theorem 1 admits an analogous generalization.

Theorem 6. A sequence of matrices $R(n) = \{R_j^k(n)\}$, $n = 0, \pm 1, \pm 2, \ldots$, is
the matrix correlation function of a wide-sense stationary vector-valued
sequence $\{\zeta(n),\ n = 0, \pm 1, \pm 2, \ldots\}$ if and only if it can be
represented in the form

$$R(n) = \int_{-\pi}^{\pi} e^{inu}\, F(du),$$

where $F(A)$ is a matrix positive definite countably-additive set function
defined on the Borel sets of the interval $[-\pi, \pi]$.



§ 3. A Basic Analysis of Hilbert Random Functions 



The study of Hilbert random functions is formally the study of functions
in the ordinary sense with values in a Hilbert space. However, the analysis
of Hilbert random functions uses the notion of covariance and other
specifically probabilistic notions and investigates various types of
convergence, so the problems dealing with random functions have certain
specific features.



Integration. Let $\{\mathcal{X}, \mathfrak{A}, m\}$ be a complete separable metric
space with a $\sigma$-finite complete measure and let $\{\zeta(x),\ x \in \mathcal{X}\}$
be a Hilbert random function. Assume that $\zeta(x) = \zeta(x, \omega)$ is a
measurable and separable random function. As is known from the above
(Chapter III, Section 3), if the covariance $B(x, y)$ is continuous at the point
$(x, x)$ for $m$-almost all $x$, then for any $\zeta(x)$ there exists a
stochastically equivalent measurable and separable random function. This
observation shows how restrictive the above assumption is. Theorem 2
(Chapter III, Section 3) yields the following corollary:



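Theorem 1. If

$$\int_{\mathcal{X}} B(x, x)\, m(dx) < \infty, \tag{1}$$

then with probability 1

$$\int_{\mathcal{X}} |\zeta(x)|^2\, m(dx) < \infty$$

and

$$\mathsf{E} \int_{\mathcal{X}} |\zeta(x)|^2\, m(dx) = \int_{\mathcal{X}} B(x, x)\, m(dx). \tag{2}$$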

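Corollary. Let $f_i(x)$, $i = 1, 2$, be functions belonging to
$\mathcal{L}_2\{\mathcal{X}, \mathfrak{A}, m\}$ and let condition (1) be satisfied.
Then with probability 1 the following integrals exist:

$$\eta_i = \int_{\mathcal{X}} f_i(x)\, \zeta(x)\, m(dx),$$

and moreover in view of Fubini's theorem

$$\mathsf{E}\eta_1 \overline{\eta_2}
  = \mathsf{E} \int_{\mathcal{X}} \int_{\mathcal{X}} f_1(x)\, \overline{f_2(y)}\,
    \zeta(x)\, \overline{\zeta(y)}\, m(dx)\, m(dy)
  = \int_{\mathcal{X}} \int_{\mathcal{X}} f_1(x)\, B(x, y)\, \overline{f_2(y)}\, m(dx)\, m(dy).$$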

A few remarks concerning the definition of integrals of random func- 
tions would seem in order. 

Remark 1. Let condition (1) be satisfied and let $m(\mathcal{X}) < \infty$. Then the
integral

$$\int_{\mathcal{X}} \zeta(x)\, m(dx), \tag{3}$$

where $\zeta(x)$ is a measurable random function, is defined and finite with
probability 1 for each realization of $\zeta(x)$. However a different approach
can be taken for the definition of integral (3). Firstly, integral (3) can be
defined as the m.s. limit of the Lebesgue integral sums of $\zeta(x)$. It is easy
to verify that this definition coincides with the usual one. To prove this
it is sufficient to consider non-negative random functions. By definition
integral (3) is the limit as $n \to \infty$ of

$$\int_{\mathcal{X}} \zeta_n(x)\, m(dx),$$

where $\zeta_n(x)$ is a monotonically non-decreasing sequence of random
functions taking on a finite number of values and such that
$\lim \zeta_n(x) = \zeta(x)$ with probability 1. Since
$|\zeta(x) - \zeta_n(x)| \le |\zeta(x)|$, in view of Lebesgue's theorem on
dominated convergence we obtain that

$$\mathsf{E} \Bigl| \int_{\mathcal{X}} \zeta(x)\, m(dx) - \int_{\mathcal{X}} \zeta_n(x)\, m(dx) \Bigr|^2
  \le m(\mathcal{X})\, \mathsf{E} \int_{\mathcal{X}} |\zeta(x) - \zeta_n(x)|^2\, m(dx) \to 0$$

as $n \to \infty$, so that

$$\int_{\mathcal{X}} \zeta(x)\, m(dx) = \operatorname*{l.i.m.}_{n \to \infty} \int_{\mathcal{X}} \zeta_n(x)\, m(dx).$$

Remark 2. Consider a random process $\{\zeta(t),\ t \in [a, b]\}$. The integral

$$\int_a^b \zeta(t)\, dt$$

is often defined as the m.s. limit of the integral sums

$$\sum_{k=1}^{n} \zeta(t_{nk})\, \Delta t_{nk}, \qquad
  \Delta t_{nk} = t_{nk} - t_{n,k-1}, \qquad a = t_{n0} < t_{n1} < \cdots < t_{nn} = b.$$

In view of Lemma 3 (Chapter I, Section 1), it is necessary and sufficient
for the existence of the m.s. limit of these sums that the limit of

$$\mathsf{E} \sum_{k=1}^{n} \zeta(t_{nk})\, \Delta t_{nk}\,
  \overline{\sum_{r=1}^{m} \zeta(t_{mr})\, \Delta t_{mr}}
  = \sum_{k=1}^{n} \sum_{r=1}^{m} B(t_{nk}, t_{mr})\, \Delta t_{nk}\, \Delta t_{mr}$$

exist as $n, m \to \infty$, i.e. that the function $B(t, s)$ ($a \le t, s \le b$) be
Riemann integrable. Thus the given definition of the integral is more restrictive
than the original one but has the advantage of not being dependent
on the notion of measurability of the process. It is easy to verify that
the latter definition of the integral, when applicable, coincides (mod P)
with the initial definition.
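The criterion in terms of double sums of the covariance can be tried out
directly. The sketch below is an illustration under assumptions not made in the
text: it takes the covariance $B(t, s) = \min(t, s)$ on $[0, 1]$, a uniform
partition with right endpoints, and hypothetical function names. It computes
the second moment of the integral sum, which is the double sum of covariance
values, and compares it with the double integral of $\min(t, s)$ over the unit
square, equal to $1/3$.

```python
def B(t, s):
    """Covariance used for the illustration: B(t, s) = min(t, s)."""
    return min(t, s)

def riemann_sum_second_moment(cov, a, b, n):
    """E|sum_k zeta(t_k) dt|^2 = sum_k sum_r cov(t_k, t_r) dt^2 on a uniform partition."""
    dt = (b - a) / n
    points = [a + k * dt for k in range(1, n + 1)]  # right endpoints t_{nk}
    return sum(cov(t, s) for t in points for s in points) * dt * dt

if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, riemann_sum_second_moment(B, 0.0, 1.0, n))
```

For this covariance the double sum equals $(n+1)(2n+1)/(6n^2)$, which makes the
rate of approach to $1/3$ explicit.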

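Indeed

$$\mathsf{E} \Bigl| \int_a^b \zeta(t)\, dt - \sum_{k=1}^{n} \zeta(t_{nk})\, \Delta t_{nk} \Bigr|^2
  = \sum_{k=1}^{n} \sum_{r=1}^{n} \int_{t_{n,k-1}}^{t_{nk}} \int_{t_{n,r-1}}^{t_{nr}}
    [B(t, s) - B(t, t_{nr}) - B(t_{nk}, s) + B(t_{nk}, t_{nr})]\, dt\, ds
  \le 2 \sum_{k=1}^{n} \sum_{r=1}^{n} \omega_{nkr}\, \Delta t_{nk}\, \Delta t_{nr} \to 0,$$

where $\omega_{nkr}$ is the oscillation of the function $B(t, s)$ in the rectangle
$t_{n,k-1} \le t \le t_{nk}$, $t_{n,r-1} \le s \le t_{nr}$.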
Remark 3. The improper m.s. integral

$$\int_{-\infty}^{\infty} \zeta(t)\, dt \quad \text{or} \quad \int_0^{\infty} \zeta(t)\, dt \tag{4}$$

is defined as the limit

$$\operatorname*{l.i.m.}_{N \to \infty} \int_{-N}^{N} \zeta(t)\, dt
  \quad \text{or} \quad
  \operatorname*{l.i.m.}_{N \to \infty} \int_0^{N} \zeta(t)\, dt.$$

In view of Lemma 3 (Chapter I, Section 1) a necessary and sufficient
condition for the existence of these integrals is the existence of the limits

$$\lim_{N, N' \to \infty} \int_{-N}^{N} \int_{-N'}^{N'} B(t, s)\, dt\, ds
  \quad \text{or} \quad
  \lim_{N, N' \to \infty} \int_0^{N} \int_0^{N'} B(t, s)\, dt\, ds.$$

This definition of improper integrals is, in certain cases, less restrictive
than the interpretation of the integrals (4) as Lebesgue integrals of the
function $\zeta(t)$ for fixed $\omega$.

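The law of large numbers. Let $\{\zeta(t),\ t \ge 0\}$ be a measurable Hilbert
process with integrable covariance in each finite interval. We say that
$\{\zeta(t),\ t \ge 0\}$ satisfies the law of large numbers if

$$\frac{1}{T} \int_0^T \zeta(t)\, dt$$

approaches a constant $c$ in a certain sense as $T \to \infty$.

It follows from Lemma 3 (Chapter I, Section 1) that in order that
the mean

$$\operatorname*{l.i.m.}_{T \to \infty} \frac{1}{T} \int_0^T \zeta(t)\, dt$$

exist, it is necessary and sufficient that the limit

$$\lim_{T, T' \to \infty} \mathsf{E}\, \frac{1}{T} \int_0^T \zeta(t)\, dt\;
  \overline{\frac{1}{T'} \int_0^{T'} \zeta(t)\, dt}
  = \lim_{T, T' \to \infty} \frac{1}{TT'} \int_0^T \int_0^{T'} B(t, s)\, dt\, ds$$

exist. Furthermore for the validity of the equality

$$\operatorname*{l.i.m.}_{T \to \infty} \Bigl\{ \frac{1}{T} \int_0^T \zeta(t)\, dt
  - \frac{1}{T} \int_0^T \mathsf{E}\zeta(t)\, dt \Bigr\} = 0 \tag{5}$$

it is necessary and sufficient that the relation

$$\lim_{T, T' \to \infty} \frac{1}{TT'} \int_0^T \int_0^{T'} R(t, s)\, dt\, ds = 0$$

be valid, where $R(t, s)$ is the correlation function of the process.
It is easy to observe that

$$\Bigl| \frac{1}{TT'} \int_0^T \int_0^{T'} R(t, s)\, dt\, ds \Bigr|^2
  \le \frac{1}{T^2} \int_0^T \int_0^T R(t, s)\, dt\, ds \cdot
      \frac{1}{T'^2} \int_0^{T'} \int_0^{T'} R(t, s)\, dt\, ds.$$

Therefore equality (5) holds if and only if

$$\lim_{T \to \infty} \frac{1}{T^2} \int_0^T \int_0^T R(t, s)\, dt\, ds = 0. \tag{6}$$

For a wide-sense stationary process $R(t, s) = R(t - s)$. Since

$$\frac{1}{T^2} \int_0^T \int_0^T R(t - s)\, dt\, ds
  = \frac{1}{T} \int_{-T}^{T} R(t) \Bigl[ 1 - \frac{|t|}{T} \Bigr]\, dt,$$

we obtain the following result.

Theorem 2. If $\zeta(t)$ is a wide-sense stationary process, then for the equality

$$\operatorname*{l.i.m.}_{T \to \infty} \frac{1}{T} \int_0^T \zeta(t)\, dt = \mathsf{E}\zeta(t) \tag{7}$$

it is necessary and sufficient that

$$\lim_{T \to \infty} \frac{1}{T} \int_{-T}^{T} R(t) \Bigl( 1 - \frac{|t|}{T} \Bigr)\, dt = 0. \tag{8}$$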



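In particular, condition (8) of the theorem is satisfied if the mean
value of the correlation function is zero:

$$\lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} R(s)\, ds = 0.$$

We express condition (8) in terms of the spectral function of the
process. We have

$$\frac{1}{T} \int_{-T}^{T} R(t) \Bigl( 1 - \frac{|t|}{T} \Bigr)\, dt
  = \int_{-\infty}^{\infty} F(du)\, \frac{1}{T} \int_{-T}^{T} e^{itu} \Bigl( 1 - \frac{|t|}{T} \Bigr)\, dt,$$

hence

$$\frac{1}{T} \int_{-T}^{T} R(t) \Bigl( 1 - \frac{|t|}{T} \Bigr)\, dt
  = \int_{-\infty}^{\infty} \frac{2(1 - \cos Tu)}{T^2 u^2}\, F(du)
  = F(\{0\}) + \int_{-\infty}^{\infty} \frac{2(1 - \cos Tu)}{T^2 u^2}\, \tilde{F}(du),$$

where $\tilde{F}(A) = F(A \setminus \{0\})$, and $\{0\}$ is the singleton containing
the point $u = 0$. It is easily verified that the last integral tends to 0 as
$T \to \infty$. Therefore

$$\lim_{T \to \infty} \frac{1}{T} \int_{-T}^{T} R(t) \Bigl( 1 - \frac{|t|}{T} \Bigr)\, dt = F(\{0\}). \tag{9}$$

Thus the following theorem is valid.

Theorem 3. For a wide-sense stationary process, the equality (7) holds if
and only if its spectral function is continuous at the point $u = 0$.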
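The role of the atom of the spectral measure at zero can be seen numerically.
The following sketch rests on assumptions made only for illustration: a
hypothetical discrete spectral measure stored as a dictionary
`{frequency: mass}`, and invented function names. It uses the weight
$2(1 - \cos Tu)/(Tu)^2$, checks it against a direct quadrature of
$\frac{1}{T}\int_{-T}^{T} e^{itu}(1 - |t|/T)\,dt$, and shows that the averaged
correlation approaches the mass at $u = 0$ as $T$ grows.

```python
import math

def fejer_weight(T, u):
    """(1/T) * int_{-T}^{T} exp(i*t*u) * (1 - |t|/T) dt = 2(1 - cos(T*u)) / (T*u)^2; equals 1 at u = 0."""
    if u == 0.0:
        return 1.0
    return 2.0 * (1.0 - math.cos(T * u)) / (T * u) ** 2

def fejer_weight_numeric(T, u, grid=20000):
    """Midpoint-rule check of the same integral (the imaginary part vanishes by symmetry)."""
    h = 2.0 * T / grid
    acc = 0.0
    for j in range(grid):
        t = -T + (j + 0.5) * h
        acc += math.cos(t * u) * (1.0 - abs(t) / T)
    return acc * h / T

def averaged_correlation(atoms, T):
    """(1/T) * int_{-T}^{T} R(t) (1 - |t|/T) dt for a discrete spectral measure {u: mass}."""
    return sum(mass * fejer_weight(T, u) for u, mass in atoms.items())

if __name__ == "__main__":
    atoms = {0.0: 0.7, 1.0: 2.0, -2.5: 0.3}  # hypothetical spectral measure
    for T in (1.0, 10.0, 100.0, 1.0e4):
        print(T, averaged_correlation(atoms, T))
```

With the atom at zero removed from `atoms`, the same averages tend to zero,
which is the situation covered by Theorem 3.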

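Differentiation. Let $\{\zeta(t),\ t \in (a, b)\}$, $-\infty \le a < b \le +\infty$,
be a Hilbert random process.

Definition 1. A random process $\zeta(t)$, $t \in (a, b)$, is m.s. differentiable
at $t_0$ (mean square differentiable) if there exists the limit

$$\operatorname*{l.i.m.}_{h \to 0} \frac{\zeta(t_0 + h) - \zeta(t_0)}{h} = \zeta'(t_0),
  \qquad t_0,\ t_0 + h \in (a, b).$$

The random variable $\zeta'(t_0)$ is called the m.s. (mean square) derivative
of the random process at point $t_0$.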
It is easy to obtain the necessary and sufficient conditions for the
m.s. differentiability of a random process. Since

$$\mathsf{E}\, \frac{\zeta(t_0 + h) - \zeta(t_0)}{h}\;
  \overline{\frac{\zeta(t_0 + h_1) - \zeta(t_0)}{h_1}}
  = \frac{1}{h h_1} \{ B(t_0 + h, t_0 + h_1) - B(t_0, t_0 + h_1)
    - B(t_0 + h, t_0) + B(t_0, t_0) \}, \tag{10}$$

it follows from Lemma 3 (Chapter I, Section 1) that for the m.s.
differentiability of the process $\zeta(t)$ at $t_0$ it is necessary and
sufficient that the generalized mixed derivative

$$\frac{\partial^2 B(t, t')}{\partial t\, \partial t'} \Bigr|_{t = t' = t_0}
  = \lim_{h, h_1 \to 0}
    \frac{B(t_0 + h, t_0 + h_1) - B(t_0, t_0 + h_1) - B(t_0 + h, t_0) + B(t_0, t_0)}{h h_1}$$

exist.

It follows from the m.s. differentiability of the process at point $t$ and
the inequality

$$\Bigl| \mathsf{E}\zeta'(t) - \frac{\mathsf{E}\zeta(t + h) - \mathsf{E}\zeta(t)}{h} \Bigr|
  \le \Bigl\{ \mathsf{E} \Bigl| \zeta'(t) - \frac{\zeta(t + h) - \zeta(t)}{h} \Bigr|^2 \Bigr\}^{1/2}$$

that

$$\mathsf{E}\zeta'(t) = \frac{d}{dt}\, \mathsf{E}\zeta(t), \tag{11}$$

and, moreover, the derivative on the right exists.

If the process is m.s. differentiable at each point $t \in (a, b)$, then the
derivative $\zeta'(t)$ forms a Hilbert random process on $(a, b)$.



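Theorem 4. Let $\{\zeta(t),\ t \in (a, b)\}$ be a Hilbert random process and let the
generalized derivative

$$\frac{\partial^2 B(t, t')}{\partial t\, \partial t'} \Bigr|_{t = t'}$$

exist for each value of $t \in (a, b)$. Then the process $\zeta(t)$ is m.s.
differentiable on $(a, b)$ and

$$B_{\zeta'\zeta'}(t, t') = \frac{\partial^2 B(t, t')}{\partial t\, \partial t'}, \tag{12}$$

$$B_{\zeta'\zeta}(t, t') = \frac{\partial B(t, t')}{\partial t}, \tag{13}$$

where $B_{\zeta'\zeta'}(t, t') = \mathsf{E}\zeta'(t)\overline{\zeta'(t')}$ is the
covariance of the process $\zeta'(t)$, and
$B_{\zeta'\zeta}(t, t') = \mathsf{E}\zeta'(t)\overline{\zeta(t')}$ is the cross
covariance of the processes $\zeta'(t)$ and $\zeta(t)$.
Only formulas (12) and (13) require a proof. We have

$$\mathsf{E}\zeta'(t)\overline{\zeta(t')}
  = \lim_{h \to 0} \mathsf{E}\, \frac{\zeta(t + h) - \zeta(t)}{h}\, \overline{\zeta(t')}
  = \lim_{h \to 0} \frac{B(t + h, t') - B(t, t')}{h}.$$

Consequently, the derivative $\partial B(t, t') / \partial t$ exists and the cross
covariance of the processes $\zeta'(t)$ and $\zeta(t)$ is given by formula (13).
Furthermore

$$\mathsf{E}\zeta'(t)\overline{\zeta'(t')}
  = \lim_{h, h' \to 0} \mathsf{E}\, \frac{\zeta(t + h) - \zeta(t)}{h}\;
    \overline{\frac{\zeta(t' + h') - \zeta(t')}{h'}}
  = \lim_{h, h' \to 0}
    \frac{B(t + h, t' + h') - B(t, t' + h') - B(t + h, t') + B(t, t')}{h h'}.$$

We thus obtain the existence of the generalized second derivative

$$\frac{\partial^2 B(t, t')}{\partial t\, \partial t'}$$

(its existence in the condition of the theorem was assumed only at $t = t'$)
and the validity of (12) is verified. □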

If the process $\zeta(t)$ is wide-sense stationary, then $B(t, t') = B(t - t')$ and
Theorem 4 implies

Corollary 1. In order that a wide-sense stationary process $\zeta(t)$ ($t \in T$) be
m.s. differentiable it is necessary and sufficient that the generalized second
derivative of the correlation function $R(t)$ exist at $t = 0$. If this condition
is satisfied then the generalized derivative $\dfrac{d^2 R(t)}{dt^2}$ exists and

$$R_{\zeta'\zeta'}(t) = \mathsf{E}\zeta'(t_0 + t)\overline{\zeta'(t_0)} = -\frac{d^2 R(t)}{dt^2}.$$

Analogous results are valid for the m.s. derivatives of higher orders.
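The differentiability criterion can be illustrated with the mixed difference
quotient from (10). The sketch below uses two covariances chosen purely as
examples: the stationary covariance $B(t, s) = e^{-(t-s)^2/2}$, for which the
quotient should approach $-R''(0) = 1$, and $B(t, s) = \min(t, s)$, for which
the quotient with $h = h_1$ equals $1/h$ and therefore has no finite limit. The
function names are hypothetical.

```python
import math

def mixed_quotient(B, t0, h, h1):
    """Mixed second difference quotient of the covariance B at the point (t0, t0)."""
    return (B(t0 + h, t0 + h1) - B(t0, t0 + h1) - B(t0 + h, t0) + B(t0, t0)) / (h * h1)

def B_smooth(t, s):
    """Stationary covariance R(t - s) with R(t) = exp(-t^2 / 2), so that -R''(0) = 1."""
    return math.exp(-((t - s) ** 2) / 2.0)

def B_brownian(t, s):
    """Covariance min(t, s); its mixed quotient grows like 1/h on the diagonal."""
    return min(t, s)

if __name__ == "__main__":
    for h in (1e-1, 1e-2, 1e-3):
        print(h, mixed_quotient(B_smooth, 0.3, h, 2 * h), mixed_quotient(B_brownian, 0.3, h, h))
```

The first column of quotients settles near 1, while the second grows without
bound, in agreement with the criterion.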

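Corollary 2. If $\zeta(t)$ is a wide-sense stationary process, $t \in (-\infty, \infty)$, and

$$\int_{-\infty}^{\infty} u^2\, F(du) < \infty,$$

where $F$ is the spectral measure of the process, then $\zeta(t)$ is m.s.
differentiable, $(\zeta'(t), \zeta(t))$ is a wide-sense stationary process and its
matrix correlation function $R(t)$ is of the form

$$R(t) = \begin{pmatrix}
  \displaystyle\int_{-\infty}^{\infty} e^{itu} u^2\, F(du) &
  \displaystyle\int_{-\infty}^{\infty} e^{itu}\, iu\, F(du) \\[2ex]
  \displaystyle -\int_{-\infty}^{\infty} e^{itu}\, iu\, F(du) &
  \displaystyle\int_{-\infty}^{\infty} e^{itu}\, F(du)
\end{pmatrix}.$$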

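Decomposition of a random process into orthogonal series. Let
$\{\zeta(t),\ t \in [a, b]\}$ be a measurable m.s. continuous Hilbert process. Its
covariance $B(t_1, t_2)$ is a continuous non-negative definite kernel in the square
$[a, b] \times [a, b]$. According to the theory of integral equations, the kernel
$B(t_1, t_2)$ can be expanded into a uniformly convergent series in terms of its
eigenfunctions $\varphi_n(t)$:

$$B(t_1, t_2) = \sum_{n=1}^{\infty} \lambda_n \varphi_n(t_1) \overline{\varphi_n(t_2)},$$

where

$$\lambda_n \varphi_n(t) = \int_a^b B(t, \tau)\, \varphi_n(\tau)\, d\tau, \qquad
  \int_a^b \varphi_n(t) \overline{\varphi_m(t)}\, dt = \delta_{nm};$$

moreover the eigenvalues $\lambda_n$ are positive.
Set

$$\xi_n = \int_a^b \zeta(t)\, \overline{\varphi_n(t)}\, dt.$$

This integral exists (Theorem 1) and, in view of the corollary to Theorem 1,

$$\mathsf{E}\xi_n \overline{\xi_m}
  = \int_a^b \int_a^b B(t, \tau)\, \overline{\varphi_n(t)}\, \varphi_m(\tau)\, dt\, d\tau
  = \lambda_n \delta_{nm},$$

i.e. the sequence of random variables $\xi_n$ ($n = 1, 2, \ldots$) is orthogonal.
Furthermore

$$\mathsf{E}\zeta(t) \overline{\xi_n} = \int_a^b B(t, \tau)\, \varphi_n(\tau)\, d\tau
  = \lambda_n \varphi_n(t).$$

It thus follows that

$$\mathsf{E} \Bigl| \zeta(t) - \sum_{k=1}^{n} \xi_k \varphi_k(t) \Bigr|^2
  = B(t, t) - 2 \sum_{k=1}^{n} \lambda_k |\varphi_k(t)|^2 + \sum_{k=1}^{n} \lambda_k |\varphi_k(t)|^2
  = B(t, t) - \sum_{k=1}^{n} \lambda_k |\varphi_k(t)|^2 \to 0$$

as $n \to \infty$ uniformly in $t$ in view of Dini's theorem.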

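Theorem 5. A measurable m.s. continuous Hilbert process $\zeta(t)$, $t \in [a, b]$,
admits the series expansion

$$\zeta(t) = \sum_{k=1}^{\infty} \xi_k \varphi_k(t), \tag{14}$$

which is convergent in $\mathcal{L}_2\{\Omega, \mathfrak{S}, \mathsf{P}\}$ for each
$t \in [a, b]$. In this expansion, $\xi_k$ is an orthogonal sequence of random
variables, $\mathsf{E}|\xi_k|^2 = \lambda_k$, where $\lambda_k$ are the
eigenvalues and $\varphi_k(t)$ the eigenfunctions of the covariance of the process.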



Remark 1. If the process $\zeta(t)$ is Gaussian, then its m.s. derivative and
integrals of the form $\int_a^b f(t)\, \zeta(t)\, dt$ are Gaussian random variables.
Therefore if $\zeta(t)$ is a real Gaussian process and $\mathsf{E}\zeta(t) = 0$, then
the coefficients $\xi_k$ in series (14) are independent Gaussian variables and
series (14) is convergent with probability 1 for each $t$.

Indeed, the independence of the variables $\xi_k$ follows from the fact that
they are orthogonal and Gaussian. For convergence with probability 1
of series (14) it is sufficient that the series

$$\sum_{k=1}^{\infty} \mathsf{E}|\xi_k \varphi_k(t)|^2 = \sum_{k=1}^{\infty} \lambda_k |\varphi_k(t)|^2$$

be convergent. However, as has already been mentioned, this series is
convergent (and its sum is equal to $B(t, t)$).

Theorem 6. If 

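then for any $\varepsilon > 0$

$$\mathsf{P} \Bigl\{ \sup_{a \le t \le b} \Bigl| \zeta(t) - \sum_{k=1}^{n} \xi_k \varphi_k(t) \Bigr|
  > \varepsilon \Bigr\} \to 0 \quad \text{as } n \to \infty.$$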







The proof is based on Lemma 1 of Section 5 in Chapter III. Set 



C„(t)= I ^k<pM at)-Ut) = Cn{t), 7n= sup |CW-C„(f)l- 

k=l a^t^b 

Then 

P {y„>6} ^ p|lC(0)| >^| + pj^sjup^ |^„(t)_^„(0)l >i| < 

^4E\C'M\J 

where Q{n, c) and G are as defined in the above-mentioned lemma. We 
have 



-cvihy 

where 

h)=E\c„{t+h)-t:’„{tr= 

= E Z ikL(Pk{t+h)-(pk{t)'] 

k = n+l 

00 

= Z h\<Pk(t + h)-(Pk{tT- 

k = n+l 

Taking (15) into account, we observe that the functions x 

(T^{t, h)'(L\h\y^ (0<r'<r) are continuous with respect to te[a, h], 
he[0, ho] and are monotonically decreasing as n increases and approach 
0 as n^co. In view of Dini’s theorem this convergence is uniform. Con- 
sequently, 

tela, ft], fte[0, fto]j=^„-0 

L\h\ J 



as n^oo. 
Putting 



and 



0 (ft) = |lg|ft|| 0 <r"<^, 



^„(C, ft)=- 



LS„\h\ 



C^g^{h) I lg|ft| 1 ^+'''’ 
(cf. Chapter III, Section 5) we obtain that 



KS„ 



G<oo, e(«, C)<— 
where $K$ is a constant independent of $n$. Thus $Q(n, \varepsilon/2) \to 0$ as
$n \to \infty$.

Moreover $\mathsf{E}|\tilde{\zeta}_n(0)|^2 \to 0$ also. The theorem is thus proved. □

Consider, as an example, the expansion of a Brownian motion into an
orthogonal series on the interval $[0, 1]$. Here $\zeta(0) = 0$,
$\mathsf{E}\zeta(t) = 0$, $\mathsf{V}\zeta(t) = t$,
$B(t, s) = \mathsf{E}\zeta(t)\zeta(s) = \min(t, s)$. The eigenvalues and
eigenfunctions of the kernel $B(t, s)$ are easily obtained. From the equation

$$\lambda_n \varphi_n(t) = \int_0^1 \min(t, s)\, \varphi_n(s)\, ds
  = \int_0^t s\, \varphi_n(s)\, ds + \int_t^1 t\, \varphi_n(s)\, ds$$

we have firstly $\varphi_n(0) = 0$. Differentiating with respect to $t$ we obtain

$$\lambda_n \varphi_n'(t) = \int_t^1 \varphi_n(s)\, ds,$$

hence $\varphi_n'(1) = 0$. Repeating the differentiation we
obtain equation $\lambda_n \varphi_n''(t) = -\varphi_n(t)$. The normalized solutions
of the last equation which satisfy the boundary conditions $\varphi_n(0) = 0$,
$\varphi_n'(1) = 0$ are of the form

$$\varphi_n(t) = \sqrt{2}\, \sin(n + \tfrac{1}{2})\pi t, \qquad
  \lambda_n^{-1} = (n + \tfrac{1}{2})^2 \pi^2, \qquad n = 0, 1, \ldots.$$

Thus

$$\zeta(t) = \sqrt{2} \sum_{n=0}^{\infty} \xi_n\,
  \frac{\sin(n + \tfrac{1}{2})\pi t}{(n + \tfrac{1}{2})\pi}, \tag{16}$$

where $\xi_n$ is a sequence of independent Gaussian random variables with
parameters (0, 1). For a fixed $t$ this series is convergent with probability 1.
Since $\zeta(t)$ is a Gaussian process and $\mathsf{E}|\zeta(t + h) - \zeta(t)|^2 = h$,
it follows that

$$\sup_{0 \le t \le 1} \Bigl| \zeta(t) - \sqrt{2} \sum_{k=0}^{n} \xi_k\,
  \frac{\sin(k + \tfrac{1}{2})\pi t}{(k + \tfrac{1}{2})\pi} \Bigr| \to 0$$

in probability.
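A short numerical check of this expansion may be helpful. The sketch below is
an illustration with hypothetical function names: it forms the eigenpairs
$\lambda_n = ((n + \tfrac12)\pi)^{-2}$, $\varphi_n(t) = \sqrt{2}\sin(n + \tfrac12)\pi t$,
verifies that the truncated sum $\sum_n \lambda_n \varphi_n(t)\varphi_n(s)$ is
close to $\min(t, s)$, and generates an approximate sample path from the
truncated series with independent standard Gaussian coefficients.

```python
import math
import random

def eigenpair(n):
    """Eigenvalue and eigenfunction of the kernel min(t, s) on [0, 1], n = 0, 1, ..."""
    freq = (n + 0.5) * math.pi
    return 1.0 / freq ** 2, (lambda t: math.sqrt(2.0) * math.sin(freq * t))

def mercer_sum(t, s, n_terms):
    """Truncated expansion sum_n lambda_n * phi_n(t) * phi_n(s) of the covariance."""
    total = 0.0
    for n in range(n_terms):
        lam, phi = eigenpair(n)
        total += lam * phi(t) * phi(s)
    return total

def sample_path(ts, n_terms, rng):
    """Truncated series: sum_n xi_n * sqrt(lambda_n) * phi_n(t) with xi_n ~ N(0, 1)."""
    xis = [rng.gauss(0.0, 1.0) for _ in range(n_terms)]
    pairs = [eigenpair(n) for n in range(n_terms)]
    return [sum(x * math.sqrt(lam) * phi(t) for x, (lam, phi) in zip(xis, pairs))
            for t in ts]

if __name__ == "__main__":
    print(mercer_sum(0.3, 0.7, 2000), min(0.3, 0.7))
    grid = [k / 10.0 for k in range(11)]
    print(sample_path(grid, 200, random.Random(0)))
```

The truncation error of the covariance sum is of order $1/N$ for $N$ terms, so
a few thousand terms are needed for three correct decimals.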

Another expansion of a Brownian motion process can be obtained as
follows. Set $\xi(t) = \zeta(t) - t\zeta(1)$. Then $\xi(t)$ is a Gaussian process
with covariance $B_1(t, s) = \min(t, s) - ts$ and $\mathsf{E}\xi(t) = 0$. The
eigenvalues and eigenfunctions of the kernel $B_1(t, s)$ are obtained in the same
manner as in the previous case. We again arrive at equation
$\lambda_n \varphi_n''(t) = -\varphi_n(t)$ with the boundary conditions
$\varphi_n(0) = \varphi_n(1) = 0$. The solutions of this equation are
of the form

$$\varphi_n(t) = \sqrt{2}\, \sin n\pi t, \qquad \lambda_n^{-1} = n^2 \pi^2, \qquad n = 1, 2, \ldots.$$

Thus

$$\zeta(t) - t\zeta(1) = \sqrt{2} \sum_{n=1}^{\infty} \xi_n\, \frac{\sin n\pi t}{n\pi},$$

where $\xi_n$ ($n = 1, 2, \ldots$) is a normalized sequence of independent Gaussian
random variables and moreover

$$\frac{\xi_n}{n\pi} = \sqrt{2} \int_0^1 \xi(t) \sin n\pi t\, dt.$$

Since

$$\mathsf{E}\zeta(1) = 0, \qquad \mathsf{E}\zeta^2(1) = 1,$$

$$\mathsf{E}\xi_n \zeta(1) = \sqrt{2}\, n\pi \int_0^1
  \mathsf{E}(\zeta(t) - t\zeta(1))\, \zeta(1)\, \sin n\pi t\, dt = 0,$$

putting $\xi_0 = \zeta(1)$ we obtain

$$\zeta(t) = t\xi_0 + \sqrt{2} \sum_{n=1}^{\infty} \xi_n\, \frac{\sin n\pi t}{n\pi}, \tag{17}$$

where $\xi_n$, $n = 0, 1, \ldots$, are independent and are normally distributed with
parameters (0, 1). The convergence properties of series (17) are the same
as those of (16).



§ 4. Stochastic Measures and Integrals 

Integrals of the form

$$\int_a^b f(t)\, d\zeta(t), \tag{1}$$

play an important role in a number of problems. Here $f(t)$ is a given
(non-random) function and $\zeta(t)$ is a random process. Generally realizations
of the process $\zeta(t)$ are functions of unbounded variation and the
integral (1) cannot be interpreted as a Stieltjes or Lebesgue-Stieltjes
integral, existing for almost all realizations of $\zeta(t)$. However, even in
this case, integral (1) can be defined in a manner such that it will possess
properties shared by the ordinary integrals.

In the present section we define and investigate properties of integrals
in which the integration is taken with respect to a random measure. Such
integrals are called stochastic integrals.

Let {Q, S, P} be a probability space, ^2 = ^2 (^5 L be a set 
and SR be the semi-ring of subsets of E. Assume that for each d gSR there 
corresponds a complex-valued random variable C(^) satisfying the 
following conditions: 

1) C(^)e^2 C(0) = O: 

2) C(^iud2) = C(^i) + C(^2)(modP), if dind2 = 0; 

3) EC(^i)Cp^=m(d,nd2), 
where m(d) is a set function on SR. 

Definition 1. The family of random variables (C(d), deSR} satisfying 
conditions l)-3) is called an elementary orthogonal stochastic measure 
and m[A) is its structural function. 

The orthogonality property of stochastic measures is expressed by 
condition 3); if A A 2 = 9, then the variables C(^i) and Ci^i) are ortho- 

gonal. 

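It follows from the definition of $m(\Delta)$ that this function is non-negative:

$$m(\Delta) = \mathsf{E}|\zeta(\Delta)|^2 \ge 0, \qquad m(\emptyset) = 0,$$

and additive: if $\Delta_1 \cap \Delta_2 = \emptyset$, then

$$m(\Delta_1 \cup \Delta_2) = \mathsf{E}|\zeta(\Delta_1) + \zeta(\Delta_2)|^2 = m(\Delta_1) + m(\Delta_2) + 2m(\Delta_1 \cap \Delta_2) = m(\Delta_1) + m(\Delta_2).$$

Thus $m(\Delta)$ is an elementary measure* on $\mathfrak{M}$. Denote by $\mathscr{L}_0\{\mathfrak{M}\}$ the class
of all simple functions $f(x)$:

$$f(x) = \sum_{k=1}^{n} c_k \chi_{\Delta_k}(x), \qquad \Delta_k \in \mathfrak{M}, \quad k = 1, 2, \ldots, n, \tag{2}$$

where $n$ is an arbitrary number and $\chi_A(x)$ is the indicator of the set $A$.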

We define the stochastic integral of a function $f(x) \in \mathscr{L}_0\{\mathfrak{M}\}$ with
respect to the elementary stochastic measure $\zeta$ as follows:

$$\eta = \int f(x)\,\zeta(dx) = \sum_{k=1}^{n} c_k \zeta(\Delta_k). \tag{3}$$



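* An elementary measure is a non-negative, additive set function defined on a semi-ring.

Since $\mathfrak{M}$ is a semi-ring, any pair of functions in $\mathscr{L}_0\{\mathfrak{M}\}$ can be
represented as a linear combination of indicators of the same sets in $\mathfrak{M}$.
Therefore if $f$ and $g \in \mathscr{L}_0\{\mathfrak{M}\}$, we assume that $f(x)$ is given by formula (2)
and $g(x) = \sum_{k=1}^{n} d_k \chi_{\Delta_k}(x)$, where $\Delta_k \cap \Delta_r = \emptyset$ for $k \ne r$.

It follows from the orthogonality of $\zeta$ that

$$\mathsf{E}\Bigl(\int f(x)\,\zeta(dx)\;\overline{\int g(x)\,\zeta(dx)}\Bigr) = \sum_{k=1}^{n} c_k \bar d_k\, m(\Delta_k). \tag{4}$$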

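Assume that the elementary measure $m$ satisfies the semi-additivity
condition and therefore may be extended to a complete measure $\{E, \mathfrak{L}, m\}$.
Then $\mathscr{L}_0\{\mathfrak{M}\}$ is a linear subset of the Hilbert space $\mathscr{L}_2\{m\} = \mathscr{L}_2\{E, \mathfrak{L}, m\}$
and $\mathscr{L}_2\{\mathfrak{M}\}$ is the closure of $\mathscr{L}_0\{\mathfrak{M}\}$ in the topology generated
by the scalar product

$$(f, g) = \int f(x)\overline{g(x)}\,m(dx). \tag{5}$$

Moreover relation (4) can be rewritten in the following form

$$\mathsf{E} \int f(x)\,\zeta(dx)\;\overline{\int g(x)\,\zeta(dx)} = \int f(x)\overline{g(x)}\,m(dx) \tag{6}$$

for any pair of functions $f(x)$ and $g(x) \in \mathscr{L}_0\{\mathfrak{M}\}$.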

We now introduce the linear span $\mathscr{L}_0\{\zeta\}$ of the family of random
variables $\{\zeta(\Delta), \Delta \in \mathfrak{M}\}$, i.e. the set of random variables which can be
represented in form (3), and the space $\mathscr{L}_2\{\zeta\}$, which is the closure of
$\mathscr{L}_0\{\zeta\}$ in the Hilbert space of random variables $\mathscr{L}_2\{\Omega, \mathfrak{S}, \mathsf{P}\}$. Note
that relation (3) determines the isometric correspondence $\eta = \psi(f)$
between $\mathscr{L}_0\{\mathfrak{M}\}$ and $\mathscr{L}_0\{\zeta\}$. This correspondence may be extended to
the isometric correspondence $\psi$ between $\mathscr{L}_2\{\mathfrak{M}\}$ and $\mathscr{L}_2\{\zeta\}$. If $\eta = \psi(f)$,
$f \in \mathscr{L}_2\{\mathfrak{M}\}$, we set

$$\eta = \psi(f) = \int f(x)\,\zeta(dx) \tag{7}$$

by definition and call the random variable $\eta$ the stochastic integral of the
function $f(x)$ with respect to the measure $\zeta$. We thus have the following

Theorem 1. a) For a simple function (2) the value of a stochastic integral
is given by formula (3);

b) for any $f(x)$ and $g(x)$ in $\mathscr{L}_2\{E, \mathfrak{L}, m\}$ equation (6) is valid;

c) $\int [\alpha f(x) + \beta g(x)]\,\zeta(dx) = \alpha \int f(x)\,\zeta(dx) + \beta \int g(x)\,\zeta(dx)$;

d) for an arbitrary sequence of functions $f^{(n)}(x) \in \mathscr{L}_2\{m\}$ such
that

$$\int |f(x) - f^{(n)}(x)|^2\,m(dx) \to 0, \qquad n \to \infty, \tag{8}$$

the following relation

$$\int f(x)\,\zeta(dx) = \operatorname*{l.i.m.}_{n\to\infty} \int f^{(n)}(x)\,\zeta(dx)$$

is satisfied.

Remark. In particular if $f^{(n)}(x)$ are simple functions

$$f^{(n)}(x) = \sum_{k=1}^{m_n} c_k^{(n)} \chi_{\Delta_k^{(n)}}(x), \qquad n = 1, 2, \ldots,$$

and (8) is satisfied then

$$\int f(x)\,\zeta(dx) = \operatorname*{l.i.m.}_{n\to\infty} \sum_{k=1}^{m_n} c_k^{(n)} \zeta(\Delta_k^{(n)}).$$

The existence of a sequence of simple functions which approximate
an arbitrary function $f(x) \in \mathscr{L}_2\{m\}$ follows from the general theorems
of measure theory. Therefore a stochastic integral can be considered
as the m.s. limit of the corresponding integral sums.

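Denote by $\mathfrak{L}_0$ the class of all sets $A \in \mathfrak{L}$ for which $m(A) < \infty$. Define
the random set function $\tilde\zeta(A)$ by

$$\tilde\zeta(A) = \int \chi_A(x)\,\zeta(dx) = \int_A \zeta(dx). \tag{9}$$

This function possesses the following properties:

a) $\tilde\zeta(A)$ is defined on the class of sets $\mathfrak{L}_0$;

b) if $A_n \in \mathfrak{L}_0$, $n = 0, 1, 2, \ldots$, $A_0 = \bigcup_{k=1}^{\infty} A_k$, $A_k \cap A_r = \emptyset$ for $k \ne r$, $k > 0$,
$r > 0$, then $\tilde\zeta(A_0) = \sum_{n=1}^{\infty} \tilde\zeta(A_n)$ in the sense of m.s. convergence;

c) $\mathsf{E}\,\tilde\zeta(A)\overline{\tilde\zeta(B)} = m(A \cap B)$, $A, B \in \mathfrak{L}_0$;

d) $\tilde\zeta(\Delta) = \zeta(\Delta)$ for $\Delta \in \mathfrak{M}$.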

Definition 2. A random set function $\tilde\zeta$ satisfying conditions a), b) and
c) is called a stochastic orthogonal measure.

Property d) signifies that $\tilde\zeta$ is an extension of the elementary stochastic
measure $\zeta$. We thus have the following

Theorem 2. If the structure function of an elementary stochastic measure
$\zeta$ is semi-additive then $\zeta$ may be extended to a stochastic measure $\tilde\zeta$.

Remark. Since $\mathscr{L}_2\{\zeta\} = \mathscr{L}_2\{\tilde\zeta\}$, we have

$$\int f(x)\,\zeta(dx) = \int f(x)\,\tilde\zeta(dx).$$

In accordance with this equality, we shall henceforth identify the
stochastic integral relative to an elementary orthogonal measure $\zeta$,
whose structure function is semi-additive, with the stochastic integral
relative to the stochastic measure $\tilde\zeta$ defined by relation (9).

A few remarks would seem in order concerning the definition of
a stochastic integral on a line. Let $\zeta(t)$ ($a \le t < b$) be a process with
orthogonal increments, i.e.

$$\mathsf{E}\bigl(\zeta(t_2) - \zeta(t_1)\bigr)\overline{\bigl(\zeta(t_4) - \zeta(t_3)\bigr)} = 0$$

for any $t_1 < t_2 \le t_3 < t_4$ in $[a, b)$, which is m.s. continuous from the left:

$$\mathsf{E}|\zeta(t) - \zeta(s)|^2 \to 0 \quad \text{as } s \uparrow t.$$

Set

$$F(t) = \mathsf{E}|\zeta(t) - \zeta(a)|^2.$$

In view of the orthogonality of the increments of the process
we have for $t_2 > t_1$

$$F(t_2) = \mathsf{E}|\zeta(t_2) - \zeta(t_1)|^2 + F(t_1);$$

hence $F(t_2) \ge F(t_1)$ and $F(t) = \lim_{s \uparrow t} F(s)$. Therefore $F(t)$ is a monotonically
non-decreasing function continuous from the left. Let $\mathfrak{M}$ be the class
of all half-intervals $\Delta = [t_1, t_2)$, $a \le t_1 \le t_2 \le b$, $\zeta(\Delta) = \zeta(t_2) - \zeta(t_1)$,
$m([t_1, t_2)) = F(t_2) - F(t_1)$. Then $\mathfrak{M}$ is a semi-ring of sets,

$$\mathsf{E}\,\zeta(\Delta_1)\overline{\zeta(\Delta_2)} = m(\Delta_1 \cap \Delta_2),$$

and $\zeta(\Delta)$ is an elementary orthogonal stochastic measure with a structural
function admitting extension to a measure. Therefore one can define
a stochastic Stieltjes integral by means of the equality

$$\int_a^b f(t)\,d\zeta(t) = \int_a^b f(t)\,\zeta(dt),$$

where $\zeta(t)$ is a process with orthogonal increments. This integral exists
for an arbitrary Borel function $f(t)$, $t \in [a, b)$, for which

$$\int_a^b |f(t)|^2\,F(dt) < \infty,$$

where $F(A)$ is the measure corresponding to the monotonic function $F(t)$.
A stochastic integral on the whole real line $(-\infty, \infty)$ is defined analogously.
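
The isometry (6) for an integral on a line can be illustrated numerically. The sketch below is not part of the original text: it assumes that $\zeta(t)$ is a standard Brownian motion on $[0, 1]$ (so that $F(t) = t$), takes the simple function equal to $t_k$ on each grid cell $[t_k, t_{k+1})$, and compares a Monte Carlo estimate of $\mathsf{E}|\eta|^2$ with $\int |f|^2\,dF$. The grid size, the number of paths and the seed are arbitrary choices made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed so the run is reproducible
n_steps, n_paths = 200, 20000           # illustrative sizes, not from the text

t = np.linspace(0.0, 1.0, n_steps + 1)  # partition of [0, 1]
dt = np.diff(t)
f_left = t[:-1]                         # simple function: value t_k on [t_k, t_{k+1})

# Orthogonal (here independent Gaussian) increments zeta(Delta_k) with
# structural function m(Delta_k) = F(t_{k+1}) - F(t_k) = dt_k.
dzeta = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))

# Formula (3): eta = sum_k c_k zeta(Delta_k), one value per simulated path.
eta = dzeta @ f_left

empirical = float(np.mean(np.abs(eta) ** 2))   # Monte Carlo estimate of E|eta|^2
exact = float(np.sum(f_left ** 2 * dt))        # integral of |f|^2 with respect to m
print(empirical, exact)
```

The two printed numbers should agree up to Monte Carlo error; refining the partition moves `exact` towards $\int_0^1 t^2\,dt = 1/3$.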

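We now prove several propositions concerning stochastic integrals.
Let $\zeta$ be an orthogonal stochastic measure with structural function
$m$ which is a complete measure on $\{E, \mathfrak{L}\}$ and let $g(x) \in \mathscr{L}_2\{m\}$. Set

$$\lambda(A) = \int \chi_A(x)\,g(x)\,\zeta(dx), \qquad A \in \mathfrak{L}.$$

Then

$$\mathsf{E}\,\lambda(A)\overline{\lambda(B)} = \int \chi_A(x)\chi_B(x)\,|g(x)|^2\,m(dx) = \int_{A \cap B} |g(x)|^2\,m(dx).$$

Introducing a new measure on $\mathfrak{L}$

$$l(A) = \int_A |g(x)|^2\,m(dx),$$

we see that $\lambda(A)$ is an orthogonal stochastic measure with structural
function $l(A)$, $A \in \mathfrak{L}$.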

Lemma 1. If $f(x) \in \mathscr{L}_2\{l\}$, then

$$\int f(x)\,\lambda(dx) = \int f(x)\,g(x)\,\zeta(dx).$$

Proof. The assertion of the lemma is obvious for simple functions $f(x)$,
$f(x) = \sum_k c_k \chi_{A_k}(x)$. Next, if $f_n(x)$ is a fundamental sequence of
simple functions in $\mathscr{L}_2\{l\}$, then

$$\mathsf{E}\Bigl|\int f_n(x)\,\lambda(dx) - \int f_{n+m}(x)\,\lambda(dx)\Bigr|^2 = \int |f_n(x) - f_{n+m}(x)|^2\,l(dx) = \int |f_n(x) - f_{n+m}(x)|^2\,|g(x)|^2\,m(dx),$$

i.e. $f_n(x)g(x)$ is a fundamental sequence in $\mathscr{L}_2\{m\}$. Approaching the
limit in the equality

$$\int f_n(x)\,\lambda(dx) = \int f_n(x)\,g(x)\,\zeta(dx)$$

as $n \to \infty$ we obtain the assertion of the lemma in the general case. □



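Lemma 2. If $A \in \mathfrak{L}_0$, then

$$\zeta(A) = \int_A \frac{1}{g(x)}\,\lambda(dx).$$

Proof. We first note that $g(x) = 0$ on a set of $l$-measure 0; therefore
$[g(x)]^{-1} \ne \infty \pmod{l}$. Next

$$\int_A \frac{1}{|g(x)|^2}\,l(dx) \le m(A) < \infty.$$

Consequently we may utilize Lemma 1 and get:

$$\int_A \frac{1}{g(x)}\,\lambda(dx) = \int \chi_A(x)\,\frac{1}{g(x)}\,g(x)\,\zeta(dx) = \zeta(A).$$

The proof of Lemma 2 is thus completed. □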

Let $T$ be a finite or infinite segment on the line, $\mathfrak{T}$ be the $\sigma$-algebra
of the Lebesgue measurable subsets of $T$, and let $l$ be the Lebesgue
measure.

Assume that the function $g(t, x)$ is $\mathfrak{T} \times \mathfrak{L}$ measurable, $g(t, x) \in \mathscr{L}_2\{l \times m\}$
and $g(t, x) \in \mathscr{L}_2\{m\}$ for arbitrary $t \in T$. Consider the stochastic integral

$$\xi(t) = \int g(t, x)\,\zeta(dx). \tag{10}$$

This integral is defined with probability 1 for each $t$.

Lemma 3. The stochastic integral (10) can be defined as a function of $t$ in
such a manner that the process $\xi(t)$ will be measurable.

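Proof. If

$$g(t, x) = \sum_k c_k \chi_{B_k}(t)\chi_{A_k}(x), \tag{11}$$

$B_k \in \mathfrak{T}$, $A_k \in \mathfrak{L}$, then $\xi(t) = \sum_k c_k \chi_{B_k}(t)\,\zeta(A_k)$ is a $\mathfrak{T} \times \mathfrak{S}$-measurable function
in the variables $(t, \omega)$, $t \in T$, $\omega \in \Omega$. In the general case a sequence of
simple functions $g_n(t, x)$ of type (11) can be constructed such that

$$\int_T \int |g(t, x) - g_n(t, x)|^2\,m(dx)\,dt \to 0 \quad \text{as } n \to \infty.$$

Let $\xi_n(t)$ be a sequence of processes constructed in accordance with
formula (10) for $g = g_n$. Then there exists a process $\tilde\xi(t)$ such that

$$\int_T \mathsf{E}|\tilde\xi(t) - \xi_n(t)|^2\,dt \to 0 \quad \text{as } n \to \infty$$

and $\tilde\xi(t)$ is a $\mathfrak{T} \times \mathfrak{S}$-measurable function of $(t, \omega)$. On the other hand,

$$\int_T \mathsf{E}|\xi(t) - \xi_n(t)|^2\,dt = \int_T \int |g(t, x) - g_n(t, x)|^2\,m(dx)\,dt \to 0,$$

so that $\mathsf{E}|\xi(t) - \tilde\xi(t)|^2 = 0$ for almost all $t$.
Set

$$\xi'(t) = \begin{cases} \tilde\xi(t) & \text{if } \mathsf{P}\{\xi(t) \ne \tilde\xi(t)\} = 0, \\ \xi(t) & \text{if } \mathsf{P}\{\xi(t) \ne \tilde\xi(t)\} > 0. \end{cases}$$

The process $\xi'(t)$ is measurable (since $\xi'(t)$ differs from the $\mathfrak{T} \times \mathfrak{S}$-measurable
function $\tilde\xi(t)$ on a set of measure 0), and is stochastically equivalent
to $\xi(t)$. The lemma is thus proved. □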

We shall henceforth assume that the processes determined by 
stochastic integrals of type (10) and satisfying the conditions enumerated 
above are measurable. 



Lemma 4. If $g(t, s)$ and $h(t)$ are Borel functions,

$$\int_{-\infty}^{\infty} \int_a^b |g(t, s)|^2\,dt\,m(ds) < \infty, \qquad \int_a^b |h(t)|^2\,dt < \infty, \tag{12}$$

and $\zeta$ is an orthogonal stochastic measure on the real line $R^1$, then

$$\int_a^b h(t) \int_{-\infty}^{\infty} g(t, s)\,\zeta(ds)\,dt = \int_{-\infty}^{\infty} g_1(s)\,\zeta(ds), \tag{13}$$

where

$$g_1(s) = \int_a^b h(t)\,g(t, s)\,dt.$$

Proof. The mathematical expectation of the square of the absolute
value of the integral in the l.h.s. of (13) is equal to

$$\int_a^b \int_a^b h(t_1)\overline{h(t_2)} \Bigl(\int_{-\infty}^{\infty} g(t_1, s)\overline{g(t_2, s)}\,m(ds)\Bigr) dt_1\,dt_2 = \int_{-\infty}^{\infty} \Bigl|\int_a^b h(t)\,g(t, s)\,dt\Bigr|^2 m(ds) \le$$

$$\le \int_a^b |h(t)|^2\,dt \cdot \int_{-\infty}^{\infty} \int_a^b |g(t, s)|^2\,dt\,m(ds).$$

The mathematical expectation of the square of the absolute value
of the integral in the r.h.s. of (13) satisfies the inequality indicated in the
second line of the last relationship. Consequently the r.h. and l.h. sides
of equality (13) are continuous with respect to the limit transition over
sequences $g_n(t, s)$ converging in $\mathscr{L}_2\{l \times m\}$, where $l \times m$ is the direct product
of the Lebesgue measure and the measure $m$ in the strip $[a, b] \times (-\infty, \infty)$.
Furthermore the set of functions $g(t, s)$ for which (13) is valid is linear
and contains all the functions of the type $\sum_k c_k \chi_{A_k}(t)\chi_{B_k}(s)$. Consequently,
this set contains all the functions belonging to $\mathscr{L}_2\{l \times m\}$. □



Remark. If the conditions of Lemma 4 are satisfied on each finite interval
$(a, b)$ and if the integral

$$\int_{-\infty}^{\infty} h(t)\,g(t, s)\,dt = \lim_{\substack{a \to -\infty \\ b \to +\infty}} \int_a^b h(t)\,g(t, s)\,dt$$

exists in the sense of $\mathscr{L}_2\{m\}$-convergence, then

$$\int_{-\infty}^{\infty} h(t) \Bigl(\int_{-\infty}^{\infty} g(t, s)\,\zeta(ds)\Bigr) dt = \int_{-\infty}^{\infty} g_1(s)\,\zeta(ds), \tag{14}$$

where

$$g_1(s) = \int_{-\infty}^{\infty} h(t)\,g(t, s)\,dt.$$

The proof follows directly from the fact that the l.h.s. of equation (14)
is an m.s. limit of the l.h.s. of equation (13), noting that the passage to
the limit under the sign of the stochastic integral is permissible in the
r.h.s. of formula (13). □

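Consider now the generalization of the previous results to the case
of vector-valued stochastic measures. Here we shall confine ourselves
to the simplest case of integration of scalar functions, which differs only
little from integration with respect to real-valued stochastic measures.

Let $\mathscr{Z}^p$ denote a complex vector space of dimension $p$. For simplicity
we shall assume that a certain basis in this space is fixed. Let there correspond
to each $\Delta \in \mathfrak{M}$ a vector-valued random variable $\zeta(\Delta)$ with values
in $\mathscr{Z}^p$, $\zeta(\Delta) = \{\zeta^1(\Delta), \zeta^2(\Delta), \ldots, \zeta^p(\Delta)\}$. Denote by $|\zeta(\Delta)|$ the norm of the
vector $\zeta(\Delta)$,

$$|\zeta(\Delta)|^2 = \sum_{k=1}^{p} |\zeta^k(\Delta)|^2.$$

Assume that

1) $\mathsf{E}|\zeta(\Delta)|^2 < \infty$, $\zeta(\emptyset) = 0$;

2) $\zeta(\Delta_1 \cup \Delta_2) = \zeta(\Delta_1) + \zeta(\Delta_2) \pmod{\mathsf{P}}$, if $\Delta_1 \cap \Delta_2 = \emptyset$;

3) $\mathsf{E}\,\zeta^k(\Delta_1)\overline{\zeta^j(\Delta_2)} = m^k_j(\Delta_1 \cap \Delta_2)$, $\Delta_i \in \mathfrak{M}$, $i = 1, 2$.

The family of random vectors $\{\zeta(\Delta), \Delta \in \mathfrak{M}\}$ is called the elementary
vector-valued stochastic (orthogonal) measure and the matrix
$m(\Delta) = \{m^k_j(\Delta)\} = \mathsf{E}\,\zeta(\Delta)\zeta^*(\Delta)$ is called the structure matrix.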

Note that the matrix $m(\Delta_1 \cap \Delta_2)$, regarded as a function of $\Delta_1$ and
$\Delta_2$, possesses properties of the correlation matrix of a vector-valued
random function (cf. Section 1). Also if $\Delta_1 \cap \Delta_2 = \emptyset$, then

$$m(\Delta_1 \cup \Delta_2) = m(\Delta_1) + m(\Delta_2).$$

From here it follows that the diagonal elements of the matrix $m(\Delta)$ are
elementary measures. Moreover it follows from the inequality

$$|m^k_j(\Delta)|^2 \le m^k_k(\Delta)\,m^j_j(\Delta) \tag{15}$$

that

$$\sum_r |m^k_j(\Delta_r)| \le \Bigl(\sum_r m^k_k(\Delta_r)\Bigr)^{1/2} \Bigl(\sum_r m^j_j(\Delta_r)\Bigr)^{1/2} \tag{16}$$

and hence the set functions $m^k_j$ ($k, j = 1, \ldots, p$) are of bounded variation
on $\Delta$.

Set $m_0(\Delta) = \operatorname{Sp} m(\Delta) = \sum_{k=1}^{p} m^k_k(\Delta)$. It follows from (16) that if
$\sum_{r=1}^{m_N} m_0(\Delta_r^{(N)}) \to 0$ as $N \to \infty$, then $\sum_{r=1}^{m_N} |m^k_j(\Delta_r^{(N)})| \to 0$ also. We thus obtain
that the functions $m^k_j(\Delta)$ can be extended to countably-additive set
functions on $\mathfrak{B}$ provided $m_0(\Delta)$ is semi-additive on $\mathfrak{M}$.

Henceforth matrix functions obtained by means of such an extension
from the structure function of an elementary orthogonal stochastic
measure will be called positive definite matrix measures.
In the above, $\mathfrak{B}$ denoted the completion of $\sigma\{\mathfrak{M}\}$ with respect to
the extended elementary measure $m_0(\Delta)$. For simplicity we shall retain
the initial notations for the extensions of the functions $m^k_j$, $m_0$ and the
matrix $m$ on $\mathfrak{B}$ and shall assume in what follows that $m_0(\Delta)$ is semi-additive
on $\mathfrak{M}$.

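Using the formula

$$\eta = \int f(x)\,\zeta(dx) = \sum_{k=1}^{n} c_k \zeta(\Delta_k) \tag{17}$$

we define on $\mathscr{L}_0\{\mathfrak{M}\}$ the stochastic integral, where $f(x) = \sum_{k=1}^{n} c_k \chi_{\Delta_k}(x)$.

The value of this integral is a random (column) vector with values in
$\mathscr{Z}^p$. Denote by $\mathscr{L}_0\{\zeta\}$ the collection of all random vectors $\eta$ of form (17).
If $g(x) = \sum_{k=1}^{n} d_k \chi_{\Delta_k}(x)$, then

$$\mathsf{E}\Bigl(\int f(x)\,\zeta(dx) \Bigl(\int g(x)\,\zeta(dx)\Bigr)^{*}\Bigr) = \sum_{k=1}^{n} c_k \bar d_k\, m(\Delta_k);$$

this can be rewritten in the form

$$\mathsf{E}\Bigl(\int f(x)\,\zeta(dx) \Bigl(\int g(x)\,\zeta(dx)\Bigr)^{*}\Bigr) = \int f(x)\overline{g(x)}\,m(dx). \tag{18}$$

We thus obtain the equation

$$\mathsf{E}\Bigl|\int f(x)\,\zeta(dx)\Bigr|^2 = \int |f(x)|^2\,m_0(dx).$$

Introduce the scalar product on $\mathscr{L}_0\{\mathfrak{M}\}$:

$$(f, g) = \int f(x)\overline{g(x)}\,m_0(dx). \tag{19}$$

Formula (17) establishes the isometric mapping $\eta = \psi(f)$ of the space
$\mathscr{L}_0\{\mathfrak{M}\}$ onto $\mathscr{L}_0\{\zeta\}$ provided we define in $\mathscr{L}_0\{\zeta\}$ the scalar product
of the elements $\eta_1$ and $\eta_2$ by $\mathsf{E}\,\eta_2^*\eta_1$. The closure of the space of random
vectors $\mathscr{L}_0\{\zeta\}$ is denoted by $\mathscr{L}_2\{\zeta\}$ and the completion of $\mathscr{L}_0\{\mathfrak{M}\}$
by $\mathscr{L}_2\{\mathfrak{M}\}$.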

The inequality

$$\Bigl(\int |f(x)|\,|g(x)|\,|m^k_j|(dx)\Bigr)^2 \le \int |f(x)|^2\,m^k_k(dx) \int |g(x)|^2\,m^j_j(dx) \tag{20}$$

is derived analogously to inequality (16) (here $|m^k_j|(A)$ is the absolute
variation of the function $m^k_j$), first for simple functions and then, using
the limiting transition, for arbitrary $\mathfrak{B}$-measurable functions. Inequality
(20) yields the existence and continuity of the integral

$$\int f(x)\overline{g(x)}\,m^k_j(dx)$$

as a functional of $f$ and $g$ in $\mathscr{L}_2\{m_0\}$.

From here, the isometric correspondence $\eta = \psi(f)$ of the space
$\mathscr{L}_0\{\mathfrak{M}\}$ onto $\mathscr{L}_0\{\zeta\}$ can be extended to an isometric correspondence
of $\mathscr{L}_2\{\mathfrak{M}\}$ into $\mathscr{L}_2\{\zeta\}$.

The random vector $\eta$ is called the stochastic integral and denoted as

$$\eta = \int f(x)\,\zeta(dx),$$

where $f(x) \in \mathscr{L}_2\{m_0\}$.

A vector-valued stochastic measure $\tilde\zeta(A)$ is defined analogously to
the notion of a stochastic measure in the scalar case.



§5. Integral Representation of Random Functions 

Utilizing the results of the previous section one can obtain different 
representations of random functions using stochastic integrals. 

We first assume that a $p$-dimensional vector random function $\xi(x)$,
$x \in \mathscr{X}$, can be represented in the form

$$\xi(x) = \int g(x, u)\,\zeta(du), \tag{1}$$

where $\zeta$ is a stochastic measure on a measurable space $\{\mathscr{U}, \mathfrak{B}\}$ with
values in $\mathscr{Z}^p$ and structure matrix $m(A)$ (here the notations used in the
previous section are retained), $g(x, u)$ is a scalar function and moreover
for each $x \in \mathscr{X}$

$$g(x, u) \in \mathscr{L}_2\{m_0\} = \mathscr{L}_2\{\mathscr{U}, \mathfrak{B}, m_0\}, \qquad m_0(A) = \operatorname{Sp} m(A).$$

In view of formula (18) in Section 4 the covariance matrix of the random
function $\xi(x)$ is of the form

$$B(x_1, x_2) = \mathsf{E}\,\xi(x_1)\xi^*(x_2) = \int g(x_1, u)\overline{g(x_2, u)}\,m(du), \tag{2}$$

and it follows from (19) in Section 4 that

$$\mathsf{E}\,\xi^*(x_2)\xi(x_1) = \int g(x_1, u)\overline{g(x_2, u)}\,m_0(du). \tag{3}$$

Recall that $\{\mathscr{U}, \mathfrak{B}, m_0\}$ is a space with a complete measure and $\mathscr{L}_2\{m_0\}$
is a Hilbert space of $\mathfrak{B}$-measurable complex-valued square $m_0$-integrable
functions.

Denote by $\mathscr{L}_2\{g\}$ the closure in $\mathscr{L}_2\{m_0\}$ of the linear span generated
by the system of functions $\{g(x, u), x \in \mathscr{X}\}$. Then $\mathscr{L}_2\{g\}$ becomes a
linear closed subspace of $\mathscr{L}_2\{m_0\}$. If $\mathscr{L}_2\{g\} = \mathscr{L}_2\{m_0\}$, the system of
functions $\{g(x, u), x \in \mathscr{X}\}$ is called complete in $\mathscr{L}_2\{m_0\}$.

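Let $\{\xi(x), x \in \mathscr{X}\}$ be a Hilbert random function with values in $\mathscr{Z}^p$,
$\mathscr{L}_0\{\xi\}$ be the set of all random vectors

$$\eta = \sum_{k=1}^{n} c_k \xi(x_k), \qquad n = 1, 2, \ldots, \quad x_k \in \mathscr{X}, \tag{4}$$

where $c_k$ are arbitrary complex numbers, and $\mathscr{L}_2\{\xi\}$ is the closure of
$\mathscr{L}_0\{\xi\}$ in the sense of mean square convergence of random vectors.

Definition 1. The family of random vectors $\{\eta_\alpha, \alpha \in A\}$, $\eta_\alpha \in \mathscr{L}_2\{\Omega, \mathfrak{S}, \mathsf{P}\}$,
is called subordinated to a random function $\{\xi(x), x \in \mathscr{X}\}$ if $\eta_\alpha \in \mathscr{L}_2\{\xi\}$,
$\alpha \in A$.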

Theorem 1. Let the covariance matrix of a random function $\{\xi(x), x \in \mathscr{X}\}$
admit representation (2), where $m$ is a positive definite matrix measure on
$\{\mathscr{U}, \mathfrak{B}\}$, $g(x, u) \in \mathscr{L}_2\{m_0\}$, $x \in \mathscr{X}$, and the family $\{g(x, u), x \in \mathscr{X}\}$ is complete
in $\mathscr{L}_2\{\mathscr{U}, \mathfrak{B}, m_0\}$. Then $\xi(x)$ can be represented by formula (1), where
$\{\zeta(A), A \in \mathfrak{B}\}$ is a stochastic orthogonal vector-valued measure subordinated
to the random function $\xi(x)$ with structural function $m(\cdot)$, and equality
(1) is satisfied with probability 1 for each $x$.

Proof. We associate by means of relation (4) a random vector $\eta$, $\eta = \psi(f)$,
with each linear combination

$$f(u) = \sum_{k=1}^{n} c_k g(x_k, u), \qquad x_k \in \mathscr{X}. \tag{5}$$

Denote by $\mathscr{L}_0\{g\}$ the set of functions (5). Define in $\mathscr{L}_0\{g\}$ the scalar
product by means of the relation

$$(f_1, f_2) = \int f_1(u)\overline{f_2(u)}\,m_0(du). \tag{6}$$

The correspondence $\eta = \psi(f)$ is an isometric mapping of $\mathscr{L}_0\{g\}$ into
$\mathscr{L}_0\{\xi\}$. Hence it can be extended to an isometric mapping of $\mathscr{L}_2\{g\}$
into $\mathscr{L}_2\{\xi\}$. In view of the completeness of the family of functions
$\{g(x, u), x \in \mathscr{X}\}$, if $A \in \mathfrak{B}$, then $\chi_A \in \mathscr{L}_2\{m_0\} = \mathscr{L}_2\{g\}$. Set $\zeta(A) = \psi(\chi_A)$.
Then $\zeta(A)$ is a vector-valued stochastic measure and its structure function
coincides with $m$:

$$\mathsf{E}\,\zeta(A_1)\zeta^*(A_2) = \int \chi_{A_1}(x)\chi_{A_2}(x)\,m(dx) = m(A_1 \cap A_2).$$

Now define a random function $\tilde\xi(x)$ by means of the stochastic integral

$$\tilde\xi(x) = \int g(x, u)\,\zeta(du).$$

Since

$$\mathsf{E}\,\xi(x)\zeta^*(A) = \int g(x, u)\chi_A(u)\,m(du),$$

it follows from the isometry of the correspondence $\eta = \psi(f)$ that

$$\mathsf{E}\,\xi(x)\tilde\xi^*(x) = \int g(x, u)\overline{g(x, u)}\,m(du).$$

We thus obtain

$$\mathsf{E}|\xi(x) - \tilde\xi(x)|^2 = \mathsf{E}\,\xi^*(x)\xi(x) - \mathsf{E}\,\tilde\xi^*(x)\xi(x) - \mathsf{E}\,\xi^*(x)\tilde\xi(x) + \mathsf{E}\,\tilde\xi^*(x)\tilde\xi(x) = 0,$$

which proves the theorem. □

We now present several applications of the theorem just proved. 
For brevity we shall refer to “a wide-sense stationary process” as simply 
a “stationary process” in the remainder of this section. 

In view of Theorem 2 of Section 2 in Chapter IV the correlation
matrix of a stationary and m.s. continuous process can be represented
in the form

$$R(t) = \int_{-\infty}^{\infty} e^{itu}\,F(du), \tag{7}$$

where $F(\cdot)$ is a non-negative definite matrix measure (the spectral
matrix of the process). Expression (7) is a particular case of (2) in which
the functions $g(x, u)$ are replaced by $e^{itu}$, $x = t$, and the collection of functions
$\{e^{itu}, -\infty < t < \infty\}$ is complete in $\mathscr{L}_2\{m_0\}$, where $m_0$ is an arbitrary
bounded measure on the line. Thus Theorem 1 is applicable and the
following result is obtained:

Theorem 2. A vector-valued m.s. continuous stationary random process $\xi(t)$
($-\infty < t < \infty$) with $\mathsf{E}\,\xi(t) = 0$ admits the representation

$$\xi(t) = \int_{-\infty}^{\infty} e^{itu}\,\zeta(du), \tag{8}$$

where $\zeta(A)$ is a vector-valued orthogonal stochastic measure on $\mathfrak{B}$ subordinated
to $\xi(t)$. There exists an isometric correspondence between $\mathscr{L}_2\{\xi\}$
and $\mathscr{L}_2\{F_0\}$, where $F_0(\cdot) = \operatorname{Sp} F(\cdot)$, such that

a) $\xi(t) \leftrightarrow e^{itu}$, $\zeta(A) \leftrightarrow \chi_A(u)$;

b) if $\eta_i \leftrightarrow g_i(u)$ ($i = 1, 2$), then

$$\eta_i = \int g_i(u)\,\zeta(du)$$

and

$$\mathsf{E}\,\eta_1\eta_2^* = \int g_1(u)\overline{g_2(u)}\,F(du).$$

Formula (8) is called the spectral decomposition (representation) of a
stationary process, and the measure $\zeta(A)$ is the stochastic spectral measure
of the process. It follows from Theorem 2 that

$$\mathsf{E}\,\zeta(A_1)\zeta^*(A_2) = \int_{A_1 \cap A_2} F(du) = F(A_1 \cap A_2), \tag{9}$$

i.e. $F(\cdot)$ is the structure function of the vector-valued stochastic measure
$\zeta(\cdot)$.

Remark 1. We have $\mathsf{E}\,\eta = 0$ for any $\eta \in \mathscr{L}_2\{\xi\}$. In particular for any $A \in \mathfrak{B}$
we have $\mathsf{E}\,\zeta(A) = 0$.

Remark 2. If $\mathsf{E}\,\xi(t) = a \ne 0$, then the previous theorem is applicable to
the process $\xi(t) - a$. On the other hand representation (8) can be retained
in the general case also if we add to $\zeta(A)$ the measure of value equal to
$a$, concentrated at the point $u = 0$.

As an example of an application of Theorem 2 we shall derive the
Kotel'nikov-Shannon theorem for a one-dimensional random process
whose spectral measure is concentrated on the finite interval $(-B, B)$.
We expand the function $e^{itu}$ into a Fourier series on the interval $(-B, B)$.
We have

$$e^{itu} = \sum_{n=-\infty}^{\infty} \frac{\sin(Bt - \pi n)}{Bt - \pi n}\, e^{i(\pi n / B)u}.$$

The series in the r.h.s. of the last formula converges uniformly in $u$ in
an arbitrary segment $[-B', B']$, $B' < B$, and its partial sums are bounded;
hence the series is convergent also in $\mathscr{L}_2\{F_0\}$. In view of the isomorphism
between the spaces $\mathscr{L}_2\{F_0\}$ and $\mathscr{L}_2\{\xi\}$ we have (in the sense of m.s.
convergence)

$$\xi(t) = \sum_{n=-\infty}^{\infty} \frac{\sin(Bt - \pi n)}{Bt - \pi n}\, \xi\Bigl(\frac{\pi n}{B}\Bigr). \tag{10}$$

Thus, the value of the random function $\xi(t)$ at any instant $t$ is uniquely
recovered by means of its values at the equidistant instants of time
$\pi n / B$, $n = 0, \pm 1, \pm 2, \ldots$.
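
The interpolation formula (10) can be tried out on a single deterministic band-limited function, which plays the role of one realization. The sketch below is an illustration only, not the authors' construction: the test signal `signal`, the band limit `B = 8`, the truncation level `n_max` and the helper name `shannon_reconstruct` are all assumptions chosen for the example. The signal $(\sin \pi(t - 0.3) / \pi(t - 0.3))^2$ has its spectrum in $[-2\pi, 2\pi] \subset (-B, B)$, and the infinite series is truncated to $|n| \le$ `n_max`.

```python
import numpy as np

def shannon_reconstruct(sample_fn, B, t, n_max):
    """Truncated series (10): sum over |n| <= n_max of
    sin(B t - pi n) / (B t - pi n) * x(pi n / B)."""
    n = np.arange(-n_max, n_max + 1)
    samples = sample_fn(np.pi * n / B)            # values at the lattice points pi n / B
    t = np.atleast_1d(np.asarray(t, dtype=float))
    # np.sinc(y) = sin(pi y) / (pi y), so this is sin(Bt - pi n) / (Bt - pi n).
    kernel = np.sinc(B * t[:, None] / np.pi - n[None, :])
    return kernel @ samples

def signal(t):
    # Hypothetical band-limited test function with spectrum inside [-2 pi, 2 pi].
    return np.sinc(t - 0.3) ** 2

B = 8.0                                           # band limit, slightly above 2 pi
t_grid = np.linspace(-3.0, 3.0, 41)               # points that are generally off the lattice
approx = shannon_reconstruct(signal, B, t_grid, n_max=2000)
max_err = float(np.max(np.abs(approx - signal(t_grid))))
print(max_err)
```

Because the chosen signal decays like $t^{-2}$, the truncated sum converges quickly; for a realization of a stationary process the convergence in (10) is only in the mean square sense.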

For stationary vector-valued sequences $\xi(n)$, $n = 0, \pm 1, \pm 2, \ldots$, one can
formulate a theorem completely analogous to Theorem 2. The only
difference is that the spectral measure of the sequence is concentrated
on the half-interval $[-\pi, \pi)$ rather than on the whole real line as in the
case of a continuous parameter process (cf. Theorem 1, Section 2).

Theorems 1 and 2 of Section 2 yield the following generalization of 
Theorem 2 on spectral decomposition of a homogeneous m.s. continu- 
ous field. 

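Theorem 3. A vector-valued homogeneous m.s. continuous field $\xi(x)$,
$x \in R^m$, can be represented in the form

$$\xi(x) = a + \int_{R^m} e^{i(x, u)}\,\zeta(du), \qquad a = \mathsf{E}\,\xi(x), \tag{11}$$

where $\zeta$ is a vector orthogonal measure on $R^m$ subordinated to the field
$\xi(x)$. There exists an isometric correspondence between $\mathscr{L}_2\{\xi\}$ and $\mathscr{L}_2\{F_0\}$,
where $F_0(\cdot) = \operatorname{Sp} F(\cdot)$, such that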

a) 

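b) if $\eta_i \leftrightarrow g_i(u)$, $g_i(u) \in \mathscr{L}_2\{F_0\}$, $i = 1, 2$, then

$$\eta_i = \int g_i(u)\,\zeta(du)$$

and

$$\mathsf{E}\,\eta_1\eta_2^* = \int g_1(u)\overline{g_2(u)}\,F(du).$$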






Corollary. If a homogeneous (scalar) field $\xi(x)$ (with $\mathsf{E}\,\xi(x) = 0$) has a
bounded spectrum, i.e.

$$R(x) = \int_{-B_1}^{B_1} \cdots \int_{-B_m}^{B_m} e^{i(x, u)}\,F(du),$$

then this field is uniquely determined by its values at the points of the lattice

$$x_n = \Bigl(\frac{\pi n^1}{B_1}, \frac{\pi n^2}{B_2}, \ldots, \frac{\pi n^m}{B_m}\Bigr),$$

$n^k = 0, \pm 1, \pm 2, \ldots$, by the formula:

$$\xi(x) = \sum_{n = (n^1, \ldots, n^m)} \; \prod_{k=1}^{m} \frac{\sin(B_k x^k - \pi n^k)}{B_k x^k - \pi n^k}\; \xi\Bigl(\frac{\pi n^1}{B_1}, \frac{\pi n^2}{B_2}, \ldots, \frac{\pi n^m}{B_m}\Bigr).$$

In this formula the summation is carried out over all possible integer-valued
vectors $n$ and the series in the r.h.s. of the formula is convergent in
the m.s. for each $x$. □

Consider also the spectral decomposition of an m.s. continuous isotropic
two-dimensional random field. In view of formula (10) in Section 2,
the correlation function of the field is of the form

$$R(x_1, x_2) = R(\rho) = \int_0^{\infty} J_0(u\rho)\,g(du), \tag{12}$$

where $x_1$ and $x_2$ are points in the plane and $\rho$ is the distance between
these points. If $(r_i, \theta_i)$ are the polar coordinates of the point $x_i$ ($i = 1, 2$),
then

$$\rho = \sqrt{r_1^2 + r_2^2 - 2 r_1 r_2 \cos(\theta_1 - \theta_2)}.$$

Applying the addition formula to the function $J_0$,

$$J_0(u\rho) = \sum_{k=-\infty}^{\infty} J_k(u r_1)\,J_k(u r_2)\,e^{ik(\theta_1 - \theta_2)},$$

we rewrite formula (12) in the form

$$R(x_1, x_2) = \int_{-\infty}^{\infty} \int_0^{\infty} J_v(u r_1)\,e^{iv\theta_1}\,\overline{J_v(u r_2)\,e^{iv\theta_2}}\;g(du)\,s(dv),$$

where $s(dv)$ is the measure concentrated at the points $k = 0, \pm 1, \pm 2, \ldots$
and $s(\{k\}) = 1$. In view of Theorem 1, the planar, isotropic, homogeneous
and m.s. continuous field $\xi(x)$, $x = re^{i\theta}$ ($\mathsf{E}\,\xi(x) = 0$), admits representation
in the form

$$\xi(x) = \sum_{k=-\infty}^{\infty} e^{ik\theta} \int_0^{\infty} J_k(ur)\,\zeta_k(du), \tag{13}$$

where $\zeta_k$ is a sequence of mutually orthogonal stochastic measures on
the line $[0, \infty)$.



§6. Linear Transformations 

Let a system $\Sigma$ (an apparatus or a device) be designed to transform
time-dependent signals (functions) $x(t)$. The function to be transformed is
called the function at the input (input function) of the system and the
transformed function is called the function at the output (output function)
or the response to the input function. Mathematically any system
is defined by a class $D$ of "admissible" functions at the input and
relations of the form

$$z(t) = T(x \mid t),$$

where $x = x(s)$ ($-\infty < s < \infty$) is an input function, $x(s) \in D$, and $z(t)$ is the
value of the output function at time $t$. The system $\Sigma$ is called linear if

a) the class of admissible functions $D$ is linear;

b) the operator $T$ satisfies the additivity principle

$$T(\alpha x_1 + \beta x_2 \mid t) = \alpha T(x_1 \mid t) + \beta T(x_2 \mid t).$$

We introduce the time-shift operation $S_\tau$ ($-\infty < \tau < \infty$) by means of
the relationship

$$x_\tau(t) = S_\tau(x \mid t) = x(t + \tau).$$

This operation is defined on the set of all functions in the variable $t$
($-\infty < t < \infty$) and is linear. The system $\Sigma$ is called homogeneous in time
(or simply homogeneous) if the class of admissible functions $D$ is invariant
with respect to the shift operation $S_\tau$, $S_\tau D = D$, and

$$T(x_\tau \mid t) = T(x \mid t + \tau) \quad \text{or} \quad T(S_\tau x \mid t) = S_\tau T(x \mid t),$$

i.e. if the transformation $T$ commutes with the shift operation $S_\tau$
($-\infty < \tau < \infty$).

The simplest example of a linear transformation is the transformation
of the form

$$z(t) = \int_{-\infty}^{\infty} h(t, s)\,x(s)\,ds; \tag{1}$$

here the class of admissible functions $D$ depends on the properties of
the function $h(t, s)$. Let the input function be $\delta_s$, where $\delta_s(t) = \delta(t - s)$ and $\delta$ is the Dirac
function. Then $z(t) = h(t, s)$ for $t > s$ and $z(t) = 0$ for $t < s$. Hence the function
$h(t, s)$ should be interpreted as the response of the system to the
$\delta$-function at the instant $s$. Accordingly, $h(t, s)$ is called the impulse
transition function of the system. If the system $\Sigma$ is homogeneous in
time then, formally,

$$h(t, a - c) = T(\delta_{a-c} \mid t) = T(S_c \delta_a \mid t) = S_c T(\delta_a \mid t) = h(t + c, a),$$

or replacing $a$ by $c$ and $t$ by $t - c$, we have

$$h(t - c, 0) = h(t, c).$$

The function $h(t) = h(t + c, c)$ is called the impulse transition function
of a homogeneous system.

Thus for a homogeneous system equation (1) becomes

$$z(t) = \int_{-\infty}^{\infty} h(t - s)\,x(s)\,ds. \tag{2}$$

The operation in the r.h.s. of relation (2) is called the convolution of
the functions $h(t)$ and $x(t)$.
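
The commutation of a convolution system with the time shift can be checked on a grid. The following sketch is an illustration under assumptions that do not come from the text: the impulse transition function $h(t) = e^{-t}$ for $t \ge 0$ (zero for $t < 0$), a Gaussian-shaped input, a step `dt = 0.01`, and a Riemann-sum discretization of (2) via `numpy.convolve`. It compares the response to the shifted input $S_\tau x$ with the shifted response $S_\tau T(x \mid \cdot)$.

```python
import numpy as np

dt = 0.01
t = np.arange(0.0, 20.0, dt)
h = np.exp(-t)                                    # hypothetical h(t) for t >= 0; zero for t < 0
x = np.exp(-((t - 5.0) ** 2) / (2 * 0.5 ** 2))    # hypothetical input, negligible near t = 0

def respond(signal):
    """Riemann-sum approximation of z(t) = integral of h(t - s) x(s) ds on the grid."""
    return dt * np.convolve(signal, h)[: len(signal)]

k = 100                                           # shift by tau = k * dt = 1.0
x_shifted = np.roll(x, -k)                        # (S_tau x)(t) = x(t + tau); wrapped values are ~1e-14

z = respond(x)
z_shifted = respond(x_shifted)

# T(S_tau x | t) should coincide with S_tau T(x | t) = z(t + tau) away from the wrapped tail.
shift_error = float(np.max(np.abs(z_shifted[: len(t) - k] - z[k:])))
print(shift_error)
```

The discrepancy is at the level of rounding error, which is the discrete counterpart of $T(S_\tau x \mid t) = S_\tau T(x \mid t)$ for a kernel depending only on $t - s$.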

If the function at the input differs from the function at the output
by a scalar factor only (i.e. the transformation $T$ does not distort
the form of the signal),

$$T(f \mid t) = \lambda f(t) \qquad (-\infty < t < \infty),$$

then $f(t)$ is called an eigenfunction, and $\lambda$ is called an eigenvalue of the
transformation $T$. The functions $e^{iut}$ (where $u$ is an arbitrary real number)
are eigenfunctions for time homogeneous systems with an integrable
impulse transition function. Indeed, all the bounded measurable functions
are admissible and

$$\int_{-\infty}^{\infty} h(t - s)\,e^{ius}\,ds = e^{iut} \int_{-\infty}^{\infty} h(s)\,e^{-ius}\,ds = H(iu)\,e^{iut},$$

where

$$H(iu) = \int_{-\infty}^{\infty} h(s)\,e^{-ius}\,ds, \tag{3}$$

the Fourier transform of the impulse transition function, is an eigenvalue
of the transformation.
Therefore, the ratio of the system's response to a simple harmonic
function to this function,

$$H(iu) = \frac{T(e^{iut} \mid t)}{e^{iut}},$$

is independent of time. The function $H(iu)$ is called the frequency characteristic
of the system or the transmission coefficient.
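
As a numerical illustration of the eigenfunction property, the sketch below evaluates the response of a convolution system to $e^{iut}$ at several instants and checks that the ratio to $e^{iut}$ does not depend on $t$. The choices here are assumptions made for the example, not taken from the text: $h(s) = e^{-s}$ for $s \ge 0$, for which (3) gives $H(iu) = 1/(1 + iu)$, the frequency $u = 2$, and a trapezoidal rule on $[0, 40]$.

```python
import numpy as np

ds = 1e-3
s = np.arange(0.0, 40.0 + ds, ds)                 # integration grid; exp(-40) is negligible
h = np.exp(-s)                                    # hypothetical integrable impulse transition function

def trapezoid(y):
    """Composite trapezoidal rule on the uniform grid s."""
    return ds * (np.sum(y) - 0.5 * (y[0] + y[-1]))

def response_to_harmonic(u, t):
    # z(t) = integral over s >= 0 of h(s) exp(i u (t - s)) ds, i.e. formula (2) with x(s) = exp(i u s)
    return trapezoid(h * np.exp(1j * u * (t - s)))

u = 2.0
ratios = [response_to_harmonic(u, t) / np.exp(1j * u * t) for t in (-1.0, 0.0, 3.5)]
H_exact = 1.0 / (1.0 + 1j * u)                    # Fourier transform (3) of exp(-s), s >= 0
max_dev = max(abs(r - H_exact) for r in ratios)
print(ratios[0], H_exact, max_dev)
```

All three ratios coincide with $1/(1 + iu)$ up to quadrature error, in agreement with the statement that $H(iu)$ is independent of time.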

One can give a somewhat different interpretation of the frequency
characteristic of system (2) by considering a different class of admissible
functions. Let $x(t)$ be integrable. In view of Fubini's theorem,

$$\int_{-\infty}^{\infty} |z(t)|\,dt \le \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |h(t - s)|\,|x(s)|\,ds\,dt = \int_{-\infty}^{\infty} |x(s)|\,ds \int_{-\infty}^{\infty} |h(t)|\,dt < \infty,$$

i.e. the function $z(t)$ is also integrable. Consider the Fourier transform
of the function $z(t)$. Applying Fubini's theorem we obtain

$$\tilde z(u) = \int_{-\infty}^{\infty} e^{-iut} z(t)\,dt = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-iu(t - s)} h(t - s)\,e^{-ius} x(s)\,ds\,dt = H(iu)\,\tilde x(u),$$

$$\tilde x(u) = \int_{-\infty}^{\infty} e^{-ius} x(s)\,ds.$$

Consequently, the ratio of the Fourier transform of the function at the
output to the Fourier transform of the function at the input does not
depend on the function at the input and is equal to the frequency characteristic
of the system:

$$H(iu) = \frac{\tilde z(u)}{\tilde x(u)}.$$

The response at time $t$ as given in formula (1) depends on the values
of the input function at times $s < t$ as well as at times $s > t$. However, in
physical devices there is no possibility to anticipate the future. Therefore
in this case

$$h(t, s) = 0 \quad \text{for } t < s. \tag{4}$$

Relation (4) is called the condition of physical realization of the system.
For systems which satisfy condition (4) formula (1) becomes

$$z(t) = \int_{-\infty}^{t} h(t, s)\,x(s)\,ds, \tag{5}$$

and if the system is homogeneous, then

$$z(t) = \int_{-\infty}^{t} h(t - s)\,x(s)\,ds = \int_0^{\infty} h(s)\,x(t - s)\,ds. \tag{6}$$

If the input function satisfies $x(s) = 0$ for $s < 0$, then

$$z(t) = \int_0^{t} h(t - s)\,x(s)\,ds. \tag{7}$$

For these systems it is more convenient to use the Laplace transform

$$\tilde z(p) = \int_0^{\infty} e^{-pt} z(t)\,dt \tag{8}$$

(rather than the Fourier transform).

It follows from formula (7) that

$$\tilde z(p) = H(p)\,\tilde x(p), \qquad \tilde x(p) = \int_0^{\infty} e^{-pt} x(t)\,dt \tag{9}$$

for $\operatorname{Re} p \ge a$ if the functions $e^{-at}h(t)$ and $e^{-at}x(t)$ are absolutely integrable.

We now proceed to the basic topic of this section: linear transformations
of random processes. We consider here transformations of random
processes which are homogeneous in time. We shall also briefly comment
on a more general case.

Let $\xi(t)$ be a measurable Hilbert process ($-\infty < t < \infty$) with covariance
$B(t, s)$, where $B(t, t)$ is integrable with respect to $t$ in each finite
interval, and also let the function $|h(t, s)|^2$ be integrable for fixed $t$. Then
with probability 1 there exists for each $a$ and $b$ the integral

$$\int_a^b h(t, s)\,\xi(s)\,ds.$$

Define the improper integral from $-\infty$ to $+\infty$ as the m.s. limit of
integrals over finite intervals of integration:

$$\eta(t) = \int_{-\infty}^{\infty} h(t, s)\,\xi(s)\,ds = \operatorname*{l.i.m.}_{\substack{a \to -\infty \\ b \to +\infty}} \int_a^b h(t, s)\,\xi(s)\,ds.$$

In order that this limit exist it is necessary and sufficient that the integral

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(t, s_1)\,B(s_1, s_2)\,\overline{h(t, s_2)}\,ds_1\,ds_2$$

exists as an improper Cauchy integral on the plane. If this integral exists
for $t \in T$, then $\eta(t)$ becomes a Hilbert random process on $T$ with the
covariance

$$B_\eta(t_1, t_2) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(t_1, s_1)\,B(s_1, s_2)\,\overline{h(t_2, s_2)}\,ds_1\,ds_2. \tag{10}$$



Assume now that $\xi(t)$ is a wide-sense stationary process with the
spectral measure $F(du)$ and $\mathsf{E}\,\xi(t) = 0$. This assumption remains valid
until the end of this section. The integral

$$\eta(t) = \int_{-\infty}^{\infty} h(t - s)\,\xi(s)\,ds \tag{11}$$

exists (in the sense defined above) if and only if the integral

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(t - s_1)\,R(s_1 - s_2)\,\overline{h(t - s_2)}\,ds_1\,ds_2 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(s_1)\,R(s_2 - s_1)\,\overline{h(s_2)}\,ds_1\,ds_2$$

exists, where $R(t)$ is the correlation function of the process. For this to
hold it is, in turn, sufficient that the function $h(t)$ be absolutely integrable
on $(-\infty, \infty)$. In this case, using the spectral representation of the correlation
function $R(t)$ we obtain the following expression for the correlation
function $R_\eta(t_1, t_2)$ of the process $\eta(t)$:

$$R_\eta(t_1, t_2) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(t_1 - s_1)\,R(s_1 - s_2)\,\overline{h(t_2 - s_2)}\,ds_1\,ds_2 =$$

$$= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(t_1 - s_1)\,e^{iu(s_1 - s_2)}\,\overline{h(t_2 - s_2)}\,ds_1\,ds_2\,F(du) =$$

$$= \int_{-\infty}^{\infty} e^{iu(t_1 - t_2)}\,|H(iu)|^2\,F(du) = R_\eta(t_1 - t_2).$$

Thus the process $\eta(t)$ is also a wide-sense stationary process.

Definition 1. For a given process $\xi(t)$, the transformation $T$ is called an
admissible filter (or simply a filter) if it is defined by formula (11), where
$h(t)$ is an absolutely integrable function on $(-\infty, \infty)$ and square integrable
on any finite interval, or if $T$ is the m.s. limit of sequences of such
transformations (in $\mathscr{L}_2\{\xi\}$).

The following relationship stipulates the condition for convergence
of transformations (11) $\eta_n(t) = T_n(\xi \mid t)$ with impulse transition functions
$h_n(t)$ and frequency characteristics $H_n(iu)$:

$$\int_{-\infty}^{\infty} |H_n(iu) - H_m(iu)|^2\,F(du) \to 0 \tag{12}$$

as $n, m \to \infty$.

This means that the sequence $H_n(iu)$ is fundamental in $\mathscr{L}_2\{F\}$. In this
case the limit $H(iu) = \operatorname{l.i.m.} H_n(iu)$ (in $\mathscr{L}_2\{F\}$) exists; it is
called the frequency characteristic of the limiting filter, and if
$\eta(t) = \operatorname{l.i.m.} \eta_n(t)$, then

$$R_\eta(t) = \int_{-\infty}^{\infty} e^{iut}\,|H(iu)|^2\,F(du). \tag{13}$$

Conversely, any function $H(iu) \in \mathscr{L}_2\{F\}$ can be approximated in the
sense of convergence in $\mathscr{L}_2\{F\}$ by means of functions which are Fourier
transforms of absolutely integrable functions. Thus it is convenient
to define filters by means of their frequency characteristics.

Theorem 1. In order that the function $H(iu)$ be the frequency characteristic
of an admissible filter for the process $\xi(t)$ with spectral measure $F$ it
is necessary and sufficient that $H(iu) \in \mathscr{L}_2\{F\}$. The correlation function
of the process at the output of the filter with frequency characteristic $H(iu)$
is given by formula (13).

Recalling the power interpretation of the spectral function we observe
from (13) that $|H(iu)|^2$ represents the amount of increase in the power
of the simple harmonic components of the process with frequencies in the
interval $(u, u + du)$ when passing through the filter.

Theorem 2. If the process $\xi(t)$ at the input of a filter with spectral
characteristic $H(iu)$ possesses the spectral representation

$$\xi(t)=\int_{-\infty}^{\infty}e^{iut}\,\zeta(du), \qquad (14)$$

then the process $\eta(t)$ at the output of the filter will be of the form

$$\eta(t)=\int_{-\infty}^{\infty}e^{iut}\,H(iu)\,\zeta(du). \qquad (15)$$




Indeed, if the filter possesses an absolutely integrable impulse transfer
function, then

$$\eta(t)=\int_{-\infty}^{\infty}h(t-s)\,\xi(s)\,ds=\int_{-\infty}^{\infty}e^{iut}\,H(iu)\,\zeta(du).$$

The proof in the general case is obtained by passing to the limit over
sequences $H_n(iu)$ converging to $H(iu)$ in $\mathcal{L}_2\{F\}$. □

Let $\eta_k(t)$ be the process at the output of the filter with the frequency
characteristic $H_k(iu)$, $\mathsf{E}\eta_k(t)=0$ ($k=1,2$). We find the mutual correlation
function of the processes $\eta_1(t)$ and $\eta_2(t)$. It follows directly from the
isomorphism of the spaces $\mathcal{L}_2\{\zeta\}$ and $\mathcal{L}_2\{F\}$ that

$$R_{12}(t)=\mathsf{E}\,\eta_1(t+s)\,\overline{\eta_2(s)}=\int_{-\infty}^{\infty}e^{iut}\,H_1(iu)\,\overline{H_2(iu)}\,F(du). \qquad (16)$$

We now present several examples of filters and their frequency 
characteristics. 
1. A band-pass filter passes only those harmonic components of a process
with frequencies in the given range $(a, b)$. The frequency characteristic
of the filter is $H(iu)=\chi_{(a,b)}(u)$; the filter is admissible for an
arbitrary process. The impulse transfer function is obtained by means of
the Fourier formula

$$h(t)=\frac{1}{2\pi}\int_a^b e^{iut}\,du=\frac{e^{ibt}-e^{iat}}{2\pi i t}.$$
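The closed form for the band-pass impulse transfer function can be checked numerically. The following is only an illustrative sketch: the function names, the midpoint quadrature and the sample values of $a$, $b$, $t$ are assumptions made for the example and are not part of the text.

```python
import cmath
import math

def bandpass_impulse_closed_form(t, a, b):
    """(e^{ibt} - e^{iat}) / (2*pi*i*t); at t = 0 the limit (b - a) / (2*pi) is used."""
    if t == 0.0:
        return complex((b - a) / (2.0 * math.pi))
    return (cmath.exp(1j * b * t) - cmath.exp(1j * a * t)) / (2.0 * math.pi * 1j * t)

def bandpass_impulse_numeric(t, a, b, steps=20000):
    """Midpoint-rule approximation of (1 / (2*pi)) * integral over (a, b) of e^{iut} du."""
    du = (b - a) / steps
    total = sum(cmath.exp(1j * (a + (k + 0.5) * du) * t) for k in range(steps))
    return total * du / (2.0 * math.pi)

a, b = 1.0, 3.0  # hypothetical pass band
max_gap = max(
    abs(bandpass_impulse_closed_form(t, a, b) - bandpass_impulse_numeric(t, a, b))
    for t in (0.0, 0.5, 2.0, -7.0)
)
print(max_gap)  # small: the two expressions agree up to quadrature error
```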



2. A high-pass filter suppresses the low frequencies without
altering the high ones. Its frequency characteristic is $H(iu)=\chi_{(|u|>a)}(u)$,
and the impulse transfer function does not exist.

3. Consider the operation of m.s. differentiation of a wide-sense
stationary process. In order that the m.s. derivative of the process $\xi(t)$
exist, it is sufficient to require the existence of $R''(0)$ (Section 3,
Corollary 1). This condition is equivalent to the requirement

$$\int_{-\infty}^{\infty}u^2\,F(du)<\infty \qquad (17)$$

(cf. Theorem 4, Section 1, Chapter I).

On the other hand, if this condition is satisfied, then for $h\to 0$

$$\frac{e^{iuh}-1}{h}\to iu \quad (\text{in } \mathcal{L}_2\{F\})$$

and it is permissible to pass to the limit as $h\to 0$ under the sign of the
stochastic integral in the relation

$$\frac{\xi(t+h)-\xi(t)}{h}=\int_{-\infty}^{\infty}e^{iut}\,\frac{e^{iuh}-1}{h}\,\zeta(du).$$

Consequently,

$$\xi'(t)=\int_{-\infty}^{\infty}e^{iut}\,iu\,\zeta(du). \qquad (18)$$



Thus a filter with the frequency characteristic $iu$, which is admissible
for all the stationary processes satisfying condition (17), corresponds to
the differentiation operation. The impulse transfer function does not
exist, but the filter can be considered as a limiting filter (as $\varepsilon\to 0$) of a
collection of filters with impulse transition functions $h_\varepsilon(t)=0$ for $|t|\ge\varepsilon$
and $h_\varepsilon(t)=-\dfrac{\operatorname{sgn} t}{\varepsilon^2}$ for $|t|<\varepsilon$, with the corresponding frequency
characteristics

$$-\frac{4\sin^2\dfrac{u\varepsilon}{2}}{iu\varepsilon^2}.$$
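The convergence of such approximating filters to the differentiation filter can be illustrated numerically. The sketch below assumes the family $h_\varepsilon(t)=-\operatorname{sgn}t/\varepsilon^2$ on $|t|<\varepsilon$ and the convention $H(iu)=\int h(t)e^{-iut}\,dt$ (the convention implied by the proof of Theorem 2); the function names and test values are hypothetical.

```python
import cmath
import math

def diff_filter_closed_form(u, eps):
    """-4 sin^2(u*eps/2) / (i*u*eps^2), written as 4i sin^2(u*eps/2) / (u*eps^2)."""
    if u == 0.0:
        return 0j
    return 4j * math.sin(u * eps / 2.0) ** 2 / (u * eps ** 2)

def diff_filter_by_quadrature(u, eps, steps=20000):
    """Midpoint rule for the integral of h_eps(t) e^{-iut} over (-eps, eps).

    steps is even, so the jump of h_eps at t = 0 falls on a cell boundary.
    """
    dt = 2.0 * eps / steps
    total = 0j
    for k in range(steps):
        t = -eps + (k + 0.5) * dt
        h = -math.copysign(1.0, t) / eps ** 2
        total += h * cmath.exp(-1j * u * t)
    return total * dt

# The closed form agrees with direct integration of the assumed kernel ...
quadrature_gap = abs(diff_filter_closed_form(3.0, 0.5) - diff_filter_by_quadrature(3.0, 0.5))
# ... and tends to iu as eps -> 0.
limit_gap = abs(diff_filter_closed_form(2.0, 1e-3) - 2.0j)
print(quadrature_gap, limit_gap)
```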



4. The shift operation. Since

$$\xi(t+s)=\int_{-\infty}^{\infty}e^{iut}\,e^{ius}\,\zeta(du),$$

it follows that the frequency characteristic $H(iu)=e^{ius}$ corresponds to
the shift operation $T_s$, $T_s(\xi \mid t)=\xi(t+s)$. The impulse transition function
does not exist.

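5. Differential equations. Consider a filter defined by a linear
differential equation with constant coefficients

$$L\eta=M\xi, \qquad (19)$$

where

$$L=a_0\frac{d^n}{dt^n}+a_1\frac{d^{n-1}}{dt^{n-1}}+\dots+a_n,\qquad M=b_0\frac{d^m}{dt^m}+b_1\frac{d^{m-1}}{dt^{m-1}}+\dots+b_m.$$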









Equation (19) is meaningful only if the process $\xi(t)$ is $m$ times m.s.
differentiable. We then seek an $n$ times m.s. differentiable stationary process
$\eta(t)$ satisfying (19). Assume that (19) admits a stationary solution. It
can be represented in the form

$$\eta(t)=\int_{-\infty}^{\infty}e^{iut}\,H(iu)\,\zeta(du).$$

Applying operations $M$ and $L$ to processes $\xi(t)$ and $\eta(t)$ respectively,
we obtain:

$$\int_{-\infty}^{\infty}e^{iut}\,L(iu)\,H(iu)\,\zeta(du)=\int_{-\infty}^{\infty}e^{iut}\,M(iu)\,\zeta(du),$$

where $L(iu)=\sum_{k=0}^{n}a_k(iu)^{n-k}$, $M(iu)=\sum_{k=0}^{m}b_k(iu)^{m-k}$. Therefore, if $L(iu)$ has
no real roots, we have

$$H(iu)=\frac{M(iu)}{L(iu)}. \qquad (20)$$
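Formula (20) is easy to evaluate for a concrete equation. The sketch below is only an illustration under stated assumptions: the coefficient lists are taken with the highest-order coefficient first, and the first-order equation $\eta'+2\eta=\xi$ is a made-up example, for which $|H(iu)|^2=1/(u^2+4)$.

```python
def poly_at(coeffs, x):
    """Horner evaluation of a polynomial; coeffs[0] multiplies the highest power."""
    value = 0j
    for c in coeffs:
        value = value * x + c
    return value

def frequency_characteristic(b, a, u):
    """H(iu) = M(iu) / L(iu) for L with coefficients a and M with coefficients b."""
    denom = poly_at(a, 1j * u)
    if denom == 0:
        raise ZeroDivisionError("L(iu) vanishes: formula (20) does not apply at this u")
    return poly_at(b, 1j * u) / denom

# Hypothetical example: eta' + 2*eta = xi, i.e. L(x) = x + 2 and M(x) = 1.
H = frequency_characteristic([1.0], [1.0, 2.0], 3.0)
power_gain = abs(H) ** 2  # equals 1 / (3**2 + 2**2) up to rounding
print(H, power_gain)
```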



Conversely, if the process $\xi(t)$ is $m$ times m.s. differentiable
($M(iu)\in\mathcal{L}_2\{F\}$) and $L(iu)\ne 0$ ($-\infty<u<\infty$), then the process

$$\eta(t)=\int_{-\infty}^{\infty}e^{iut}\,\frac{M(iu)}{L(iu)}\,\zeta(du)$$

is $n$ times m.s. differentiable and satisfies equation (19). Therefore under
the conditions $M(iu)\in\mathcal{L}_2\{F\}$, $L(iu)\ne 0$ there exists a unique filter
corresponding to the differential equation (19). Note, however, that the
solution of equation (19) can be determined in more general cases. Assume
that $L(iu)$ has no real roots. The filter with the frequency characteristic
$M(iu)/L(iu)$ exists even if $M(iu)\notin\mathcal{L}_2\{F\}$. It is sufficient to require here
only that $M(iu)/L(iu)\in\mathcal{L}_2\{F\}$. The latter always holds if the degree $n$ of the
polynomial $L$ is greater than or equal to $m$. Therefore, when $n\ge m$, a filter
with the frequency characteristic (20) whose denominator does not vanish for
real $u$ is admissible for an arbitrary process at the input, while the process
at the output is identified with the stationary solution of equation (19).
Considering, as before, only those differential equations for which the
polynomial $L(x)$ has no purely imaginary roots, we extract from the rational
function $M(x)/L(x)$ its integral part $P(x)$ (which is non-zero only if $m\ge n$)
and expand the remainder into partial fractions. We thus obtain

$$\frac{M(iu)}{L(iu)}=P(iu)+\sum_{k=1}^{n'}\sum_{s=1}^{l'_k}\frac{c'_{ks}}{(iu-p'_k)^s}+\sum_{k=1}^{n''}\sum_{s=1}^{l''_k}\frac{c''_{ks}}{(iu-p''_k)^s},$$

where $P(iu)=\sum_{k=0}^{m-n}\gamma_k(iu)^k$ ($m\ge n$) and $P(iu)=0$ ($m<n$),
$\operatorname{Re}p'_k<0$ and $\operatorname{Re}p''_k>0$, and $p'_k$, $p''_k$ are the roots of the polynomial $L(x)$.
Since

$$\frac{1}{(iu-p)^s}=\frac{1}{(s-1)!}\int_0^{\infty}t^{s-1}e^{pt}e^{-iut}\,dt \qquad (\operatorname{Re}p<0)$$

and

$$\frac{1}{(iu-p)^s}=\frac{(-1)^s}{(s-1)!}\int_0^{\infty}t^{s-1}e^{-pt}e^{iut}\,dt \qquad (\operatorname{Re}p>0),$$

the process at the output of the filter can be represented as

$$\eta(t)=\sum_{k=0}^{m-n}\gamma_k\,\xi^{(k)}(t)+\int_0^{\infty}\xi(t-\tau)\,G_1(\tau)\,d\tau+\int_0^{\infty}\xi(t+\tau)\,G_2(-\tau)\,d\tau,$$



where 



We note that if the polynomial L(x) has roots with a positive real part, 
then the corresponding filter can not be physically realized in practice. 



§7. Physically Realizable Filters

In the present section the following problem is investigated : what are the 
spectral functions that can be obtained at the output of a physically real- 
izable filter ? Here at the input of the filter the simplest (in a certain sense) 
random process is considered. 

The processes discussed in this section are always assumed to be one- 
dimensional and wide-stationary. Therefore the word “stationary” will 
sometimes be omitted, while the word “wide” is always omitted. 

We start with stationary sequences. We shall not restate for sequences all
the definitions and heuristic considerations given for continuous-parameter
processes, but we shall utilize the corresponding terminology. Consider a
system such that the states at the input and output are registered only at
the integer-valued instants of time $t=0, \pm 1, \pm 2, \dots$.

Let a unit impulse enter the system at time 0. The system's response
at time $t$ is denoted by $a_t$. If the system does not anticipate the future,
$a_t=0$ for $t<0$. If the system is homogeneous in time, its response to the
unit impulse applied at time $s$ is equal to $a_{t-s}$. The response at time $t$ of
a linear, homogeneous and physically realizable system to the sequence
of impulses $\xi(n)$ ($-\infty<n<\infty$) will be

$$\eta(t)=\sum_{n=-\infty}^{t}a_{t-n}\,\xi(n)=\sum_{n=0}^{\infty}a_n\,\xi(t-n), \qquad (1)$$

i.e. $\eta(t)$ is a process of moving averages.

Assume that $\xi(n)$ is a standard uncorrelated sequence:

$$\mathsf{E}\xi(n)=0,\qquad \mathsf{E}\,\xi(n)\overline{\xi(m)}=\delta_{nm}\qquad(-\infty<n,m<\infty).$$

This sequence has a constant spectral density.

In order that series (1) be m.s. convergent, it is necessary and sufficient
that

$$\sum_{n=0}^{\infty}|a_n|^2<\infty. \qquad (2)$$

If this condition is satisfied then the process $\eta(t)$ is also wide-sense
stationary and

$$\mathsf{E}\eta(t)=0,\qquad R_\eta(t)=\sum_{n=0}^{\infty}a_{n+t}\,\overline{a_n}. \qquad (3)$$
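Formula (3) can be compared numerically with the spectral form of the correlation function. The sketch below is an illustration only: it takes a finite, made-up coefficient sequence and assumes the density is normalised as $f(u)=\frac{1}{2\pi}\bigl|\sum_n a_n e^{-inu}\bigr|^2$ on $[-\pi,\pi)$, so that $R_\eta(t)=\int_{-\pi}^{\pi}e^{itu}f(u)\,du$. All names are hypothetical.

```python
import cmath
import math

def ma_correlation(a, t):
    """R(t) = sum_n a[n+t] * conj(a[n]) for t >= 0, and R(-t) = conj(R(t))."""
    if t < 0:
        return ma_correlation(a, -t).conjugate()
    terms = (complex(a[n + t]) * complex(a[n]).conjugate() for n in range(max(len(a) - t, 0)))
    return sum(terms, 0j)

def ma_spectral_density(a, u):
    """f(u) = |g|^2 with g = (1/sqrt(2*pi)) * sum_n a[n] e^{-inu} (assumed normalisation)."""
    g = sum(a_n * cmath.exp(-1j * n * u) for n, a_n in enumerate(a)) / math.sqrt(2.0 * math.pi)
    return abs(g) ** 2

def correlation_from_density(a, t, steps=4096):
    """Rectangle rule for the integral of e^{itu} f(u) over [-pi, pi)."""
    du = 2.0 * math.pi / steps
    return sum(
        cmath.exp(1j * t * (-math.pi + k * du)) * ma_spectral_density(a, -math.pi + k * du)
        for k in range(steps)
    ) * du

coeffs = [1.0, 0.5, 0.25j]  # hypothetical finite impulse response
worst = max(abs(ma_correlation(coeffs, t) - correlation_from_density(coeffs, t)) for t in range(-3, 4))
print(worst)  # the two expressions agree up to rounding
```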

What kind of sequences can be obtained in this manner? 

Lemma 1. In order that a stationary sequence $\eta(n)$ be a response of a
physically realizable filter to an uncorrelated sequence, it is necessary and
sufficient that the sequence $\eta(n)$ have an absolutely continuous spectral
measure and its spectral density $f(u)$ admit the representation

$$f(u)=|g(e^{iu})|^2,\qquad g(e^{iu})=\sum_{n=0}^{\infty}b_n e^{-inu},\qquad \sum_{n=0}^{\infty}|b_n|^2<\infty. \qquad (4)$$

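Proof. Necessity. Let the sequence be represented by (1). Set

$$g(e^{iu})=\frac{1}{\sqrt{2\pi}}\sum_{n=0}^{\infty}a_n e^{-inu}. \qquad (5)$$

In view of Parseval's equality

$$R_\eta(t)=\sum_{n=0}^{\infty}a_{n+t}\,\overline{a_n}=\int_{-\pi}^{\pi}e^{itu}\,|g(e^{iu})|^2\,du,$$

i.e. the sequence $\eta(n)$ possesses an absolutely continuous spectrum with
density $f(u)=|g(e^{iu})|^2$.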
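Sufficiency. Let $\eta(n)$ be a sequence with the correlation function

$$R_\eta(t)=\int_{-\pi}^{\pi}e^{itu}f(u)\,du$$

and $f(u)=|g(e^{iu})|^2$, where $g(e^{iu})=\sum_{k=0}^{\infty}b_k e^{-iku}$ is determined by
relation (4). The sequence $\eta(n)$ has the spectral representation

$$\eta(n)=\int_{-\pi}^{\pi}e^{inu}\,\zeta(du).$$

We construct the following stochastic measure on the $\sigma$-algebra of Borel
subsets of the interval $[-\pi,\pi)$:

$$\xi(A)=\int_{-\pi}^{\pi}\frac{1}{g(e^{iu})}\,\chi_A(u)\,\zeta(du).$$

Then

$$\mathsf{E}\,\xi(A)\overline{\xi(B)}=\int_{A\cap B}\frac{f(u)}{|g(e^{iu})|^2}\,du=\int_{A\cap B}du,$$

i.e. $\xi(A)$ is an orthogonal measure with structure function $l(A\cap B)$,
where $l$ is the Lebesgue measure. Using Lemmas 2 and 1 of Section 4,
we obtain

$$\eta(n)=\int_{-\pi}^{\pi}e^{inu}\,g(e^{iu})\,\xi(du)=\int_{-\pi}^{\pi}\sum_{k=0}^{\infty}b_k e^{i(n-k)u}\,\xi(du)=\sum_{k=0}^{\infty}a_k\,\xi(n-k),$$

where $a_k=\sqrt{2\pi}\,b_k$,

$$\xi(n)=\frac{1}{\sqrt{2\pi}}\int_{-\pi}^{\pi}e^{iun}\,\xi(du)$$

and

$$\mathsf{E}\,\xi(n)\overline{\xi(m)}=\frac{1}{2\pi}\int_{-\pi}^{\pi}e^{iu(n-m)}\,du=\delta_{nm}.$$

Thus $\xi(n)$ is a standard uncorrelated sequence. □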
The lemma just proved gives us a simple answer to the question
posed above. But this answer is not sufficiently effective in the general 
case, since it is still unclear when the spectral density can be represented 
by formula (4). 

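We now obtain the conditions on $f(u)$ required to admit such a
representation. Denote by $H_2$ the set of all functions $f(z)$ analytic
in the disc $D=\{z: |z|<1\}$ and such that

$$\|f(z)\|^2=\lim_{r\uparrow 1}\int_{-\pi}^{\pi}|f(re^{i\theta})|^2\,d\theta<\infty.$$

If $f(z)=\sum_{n=0}^{\infty}a_n z^n$, then $f(re^{i\theta})=\sum_{n=0}^{\infty}a_n r^n e^{in\theta}$ and $a_n r^n$ are the Fourier
coefficients of the function $f(re^{i\theta})$. In view of Parseval's theorem

$$\int_{-\pi}^{\pi}|f(re^{i\theta})|^2\,d\theta=2\pi\sum_{n=0}^{\infty}|a_n|^2 r^{2n}.$$

It thus follows that $f(z)\in H_2$ if and only if

$$\sum_{n=0}^{\infty}|a_n|^2<\infty.$$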



Consequently, one can define for each function $f(z)\in H_2$ a series
$f(e^{i\theta})=\sum_{n=0}^{\infty}a_n e^{in\theta}$ which is convergent in $\mathcal{L}_2\{l\}$, where $l$ is the
Lebesgue measure on $[-\pi,\pi)$. The function $f(z)$ ($|z|<1$) can be determined
from the function $f(e^{i\theta})$ by means of Poisson's formula

$$f(re^{i\theta})=\frac{1}{2\pi}\int_{-\pi}^{\pi}f(e^{iu})\,P(r,\theta,u)\,du, \qquad (6)$$

where

$$P(r,\theta,u)=\frac{1-r^2}{1-2r\cos(\theta-u)+r^2}=\sum_{n=-\infty}^{\infty}r^{|n|}e^{in(\theta-u)}.$$

The proof of this assertion follows directly from Parseval's theorem. □
It is proved in the theory of complex functions (cf. Privalov [46])*
that if in formula (6) the function $f(e^{iu})$ is Lebesgue integrable, then for
almost all $\theta$ there exists the limit

$$\lim_{r\uparrow 1}f(re^{i\theta})=f(e^{i\theta}).$$

The function $f(e^{i\theta})$ is called the boundary value of the function $f(z)$ ($|z|<1$).

* See also P. L. Duren, Theory of $H^p$ Spaces, Academic Press, N.Y., 1970 (p. 5, Corollary 2).
Translator's Remark.

Theorem 1. Let $f(u)$ be a non-negative and Lebesgue integrable function
on $[-\pi,\pi)$. For the existence of a function $g(z)\in H_2$ such that

$$f(u)=|g(e^{iu})|^2 \qquad (7)$$

it is necessary and sufficient that

$$\int_{-\pi}^{\pi}|\ln f(u)|\,du<\infty. \qquad (8)$$



Proof. Necessity. Let $g(z)=\sum_{n=0}^{\infty}a_n z^n\in H_2$ and let (7) be valid. It can be
assumed that $g(0)\ne 0$ (otherwise the function $z^{-m}g(z)$ may be taken
in place of $g(z)$, where $m$ is the multiplicity of the zero at $z=0$ of the
function $g(z)$) and that $g(0)=1$. Let $0<r<1$ and $A=\{u: |g(re^{iu})|\le 1\}$,
$B=\{u: |g(re^{iu})|>1\}$. Then

$$\int_{-\pi}^{\pi}\bigl|\ln|g(re^{iu})|\bigr|\,du=\int_B\ln|g(re^{iu})|\,du-\int_A\ln|g(re^{iu})|\,du=2\int_B\ln|g(re^{iu})|\,du-\int_{-\pi}^{\pi}\ln|g(re^{iu})|\,du.$$

It follows from Jensen's formula that

$$\frac{1}{2\pi}\int_{-\pi}^{\pi}\ln|f(re^{iu})|\,du=\ln\prod_{k=1}^{n}\frac{r}{|z_k|}\ge 0,$$

where $z_k$ are the zeros of the function $f(z)$ in the interior of the circle $|z|<r$ and
$|f(0)|=1$. Consequently,

$$\int_{-\pi}^{\pi}\bigl|\ln|g(re^{iu})|\bigr|\,du\le 2\int_B\ln|g(re^{iu})|\,du\le\int_B|g(re^{iu})|^2\,du\le\int_{-\pi}^{\pi}|g(re^{iu})|^2\,du\le 2\pi\sum_{n=0}^{\infty}|a_n|^2.$$

Applying Fatou's lemma, we obtain

$$\int_{-\pi}^{\pi}\bigl|\ln|g(e^{iu})|\bigr|\,du=\int_{-\pi}^{\pi}\lim_{r\uparrow 1}\bigl|\ln|g(re^{iu})|\bigr|\,du\le\varliminf_{r\uparrow 1}\int_{-\pi}^{\pi}\bigl|\ln|g(re^{iu})|\bigr|\,du\le 2\pi\sum_{n=0}^{\infty}|a_n|^2,$$

which proves the necessity of condition (8).

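Sufficiency. Let condition (8) be satisfied. The function

$$w(r,\theta)=\frac{1}{2\pi}\int_{-\pi}^{\pi}\ln f(u)\,P(r,\theta,u)\,du$$

is harmonic in the disc $D=\{z: |z|<1\}$. We note that the Jensen inequality
yields

$$w(r,\theta)\le\ln\left\{\frac{1}{2\pi}\int_{-\pi}^{\pi}f(u)\,P(r,\theta,u)\,du\right\}.$$

Denote the analytic function in $D$ with the real part $w(r,\theta)$ by $\varphi(z)$. Set
$g(z)=\exp\{\tfrac{1}{2}\varphi(z)\}$. Then

$$|g(re^{i\theta})|^2=e^{w(r,\theta)}\le\frac{1}{2\pi}\int_{-\pi}^{\pi}f(u)\,P(r,\theta,u)\,du$$

and

$$\int_{-\pi}^{\pi}|g(re^{i\theta})|^2\,d\theta\le\int_{-\pi}^{\pi}|f(u)|\,du<\infty.$$

Thus $g(z)\in H_2$ and $\lim_{r\uparrow 1}|g(re^{i\theta})|^2=\exp\{\lim_{r\uparrow 1}w(r,\theta)\}=f(\theta)$ almost
everywhere. The theorem is proved. □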
Remark 1. It follows from the proof of the theorem that the function $g(z)$ can
be chosen in such a manner that it will be positive for $z=0$ and have no
zeros in $D$.

Remark 2. The function $g(z)$ whose existence was established in Theorem 1
is not uniquely determined. However, if $g(z)$ satisfies the conditions

a) $g(z)\ne 0$, $z\in D$, b) $g(0)>0$,

then this function is unique and therefore coincides with the one obtained
in the theorem.

Indeed, if $g_i(z)$ ($i=1,2$) are two such functions, then $\psi(z)=g_1(z)/g_2(z)$ is
analytic in $D$ and is non-vanishing, and its absolute value is one on the
boundary of $D$. The function $\ln\psi(z)$ will be analytic in $D$ and its real part
will be zero on the boundary of $D$. Therefore $\ln\psi(z)=ik$, where $k$ is a real
number. Since $\ln\psi(0)$ is real it follows that $\ln\psi(z)=0$. □

Combining Lemma 1 with Theorem 1 we obtain the following asser- 
tion. 



Theorem 2. In order that the sequence $\eta(t)$ admit the representation

$$\eta(t)=\sum_{n=0}^{\infty}a_n\,\xi(t-n),\qquad \sum_{n=0}^{\infty}|a_n|^2<\infty,$$

where $\xi(n)$ is an uncorrelated sequence, it is necessary and sufficient that
$\eta(t)$ possess an absolutely continuous spectral measure and its spectral
density satisfy the condition

$$\int_{-\pi}^{\pi}\ln f(u)\,du>-\infty.$$

Let $\zeta_1(x)$, $\zeta_2(x)$, $x\in X$, be two Hilbert random functions. Denote by
$\mathcal{L}_2\{\zeta_i\}$ the closed linear span of the system of random variables $\{\zeta_i(x),$
$x\in X\}$ in $\mathcal{L}_2$.

Definition 1. If $\mathcal{L}_2\{\zeta_1\}\subset\mathcal{L}_2\{\zeta_2\}$, then the random function $\zeta_1(x)$ is called
subordinated to $\zeta_2(x)$. If, however, $\mathcal{L}_2\{\zeta_1\}=\mathcal{L}_2\{\zeta_2\}$, then $\zeta_1(x)$ and $\zeta_2(x)$
are called equivalent.

Remark 1. It follows from the proof of Lemma 1 that the sequences $\xi(n)$
and $\eta(n)$ are equivalent.

We now show how to express the coefficients $a_n$ in the operation of
moving averages in terms of the spectral density $f(u)$ of the sequence $\eta(t)$.
The function $\varphi(z)$ introduced in the course of the proof of Theorem 1
is analytic in $D$ and its real part has the boundary value $\ln f(u)$. Hence
using Schwarz's formula

$$\varphi(z)=\frac{1}{2\pi}\int_{-\pi}^{\pi}\ln f(u)\,\frac{e^{iu}+z}{e^{iu}-z}\,du. \qquad (9)$$

Expanding the function $g(z)=\exp\{\tfrac{1}{2}\varphi(z)\}$ into the power series
$g(z)=\sum_{n=0}^{\infty}b_n z^n$ we obtain the following values for the coefficients:
$a_n=\sqrt{2\pi}\,b_n$.

On the other hand the expression for $g(z)$ can be transformed as follows.
Since

$$\frac{e^{iu}+z}{e^{iu}-z}=1+\frac{2ze^{-iu}}{1-ze^{-iu}}=1+2\sum_{k=1}^{\infty}z^k e^{-iku},$$

it follows that

$$g(z)=\exp\left\{\frac{1}{4\pi}\int_{-\pi}^{\pi}\ln f(u)\,du+\frac{1}{2\pi}\sum_{k=1}^{\infty}d_k z^k\right\},$$

where

$$d_k=\int_{-\pi}^{\pi}e^{-iku}\ln f(u)\,du.$$

Setting

$$P=\exp\left\{\frac{1}{4\pi}\int_{-\pi}^{\pi}\ln f(u)\,du\right\},\qquad \exp\left\{\frac{1}{2\pi}\sum_{k=1}^{\infty}d_k z^k\right\}=\sum_{k=0}^{\infty}c_k z^k \quad (c_0=1),$$

we obtain

$$g(z)=P\sum_{k=0}^{\infty}c_k z^k.$$

Hence

$$a_n=\sqrt{2\pi}\,P\,c_n. \qquad (10)$$
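The recipe leading to (10) can be tried out numerically. The sketch below is an illustration under stated assumptions, not the authors' procedure: the integrals are replaced by rectangle sums on a uniform grid, the power series of the exponential is obtained from the standard recursion $n c_n=\sum_{k=1}^{n}k s_k c_{n-k}$ for $\exp\{\sum_k s_k z^k\}$, and the test density $f(u)=|1+0.5e^{iu}|^2/(2\pi)$ is a made-up real, even example (so questions of complex conjugation do not arise), for which the expected coefficients are $a_0=1$, $a_1=0.5$, $a_n=0$ for $n\ge 2$.

```python
import cmath
import math

def ma_coefficients(log_density, count, steps=2048):
    """Approximate a_0..a_{count-1} from ln f via P, d_k and c_k as in formula (10)."""
    du = 2.0 * math.pi / steps
    grid = [-math.pi + j * du for j in range(steps)]
    logs = [log_density(u) for u in grid]
    # P = exp{(1/(4*pi)) * integral of ln f}
    P = math.exp(sum(logs) * du / (4.0 * math.pi))
    # d_k = integral of e^{-iku} ln f(u) du, k >= 1 (index 0 is unused)
    d = [0j] + [
        sum(val * cmath.exp(-1j * k * u) for val, u in zip(logs, grid)) * du
        for k in range(1, count)
    ]
    s = [dk / (2.0 * math.pi) for dk in d]  # coefficients of the exponent series
    c = [1.0 + 0j]                          # c_0 = 1
    for n in range(1, count):
        c.append(sum(k * s[k] * c[n - k] for k in range(1, n + 1)) / n)
    return [math.sqrt(2.0 * math.pi) * P * cn for cn in c]

def example_log_density(u):
    """ln f for the hypothetical density f(u) = |1 + 0.5 e^{iu}|^2 / (2*pi)."""
    return math.log((1.25 + math.cos(u)) / (2.0 * math.pi))

a = ma_coefficients(example_log_density, 4)
print(a)  # close to [1, 0.5, 0, 0]
```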

We now proceed to continuous-parameter processes. The operation
which assigns to the random process $\xi(t)$ the process $\eta(t)$, $t\in(-\infty,\infty)$,
determined by the formula

$$\eta(t)=\int_0^{\infty}a(s)\,d\xi(t-s) \qquad (11)$$

may serve as a generalization of the operation of moving averages to the
case of continuous-parameter random processes.

A process with orthogonal increments $\xi(t)$ will be called standard if

$$\mathsf{E}[\xi(t+h)-\xi(t)]=0,\qquad \mathsf{E}|\xi(t+h)-\xi(t)|^2=h.$$

In view of the remarks in Section 4 concerning Stieltjes' stochastic
integrals, a certain stochastic orthogonal measure $\xi(A)$, defined on the
$\sigma$-algebra of Lebesgue measurable sets, is associated with the process $\xi(t)$.
This measure is also called the standard stochastic measure. A necessary
and sufficient condition for the existence of integral (11) is that $a(t)$ be
Lebesgue measurable and

$$\int_0^{\infty}|a(t)|^2\,dt<\infty.$$

0 

Note that the standard process $\xi(t)$ is not m.s. differentiable. However the
quotients

$$\xi'_\Delta(t_k)=\frac{\xi(t_{k+1})-\xi(t_k)}{\Delta},\qquad \Delta=t_{k+1}-t_k,$$

are orthogonal for all $t_k$ and for $\Delta$ arbitrarily small. Therefore the fictitious
derivative $\xi'(t)$ should be regarded as a process whose values at any two
instants of time are orthogonal and whose variance is infinite. This fictitious
process is often utilized in arguments and is called "white noise". A precise
definition of white noise is given in the theory of generalized random
processes (Gel'fand and Vilenkin [10]). Symbolically formula (11) can be
written as

$$\eta(t)=\int_{-\infty}^{t}a(t-s)\,\xi'(s)\,ds,$$



where $\eta(t)$ is interpreted as the response of a physically realizable filter to
"white noise". The impulse transfer function of this filter equals zero for
$t<0$ and equals $a(t)$ for $t>0$. Note that the admissible physically realizable
filters for the process $\xi'(t)$ are all given by formula (11). Indeed, any
admissible physically realizable filter is by definition either of the form
(11) or is a limit of filters of this form. The condition for m.s. convergence
of filters of type (11) with the impulse transition functions $a_n(t)$ is as
follows:

$$\int_0^{\infty}|a_n(s)-a_{n'}(s)|^2\,ds\to 0 \quad\text{as } n, n'\to\infty.$$

If, however, this condition is satisfied, then $\mathrm{l.i.m.}\,a_n(t)=a(t)$ (with respect
to the Lebesgue measure on $(0,\infty)$) exists and

$$\mathrm{l.i.m.}\,\eta_n(t)=\mathrm{l.i.m.}\int_0^{\infty}a_n(s)\,d\xi(t-s)=\int_0^{\infty}a(s)\,d\xi(t-s).$$

Thus, the limiting transition in filters of type (11) does not extend the class
of filters.

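Formula (11) can be rewritten as follows:

$$\eta(t)=\int_{-\infty}^{\infty}a(t-s)\,d\xi(s),\qquad a(t)=0 \text{ for } t<0.$$

Hence the correlation function of the process $\eta(t)$ is equal to

$$R(t)=\int_{-\infty}^{\infty}a(t+s-u)\,\overline{a(s-u)}\,du$$

or

$$R(t)=\int_0^{\infty}a(t+s)\,\overline{a(s)}\,ds. \qquad (12)$$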



Lemma 2. In order that a wide-sense stationary process $\eta(t)$ be a response
of a physically realizable filter to "white noise" subordinated to this
process it is necessary and sufficient that the process $\eta(t)$ possess an absolutely
continuous spectral measure and that its spectral density $f(u)$ admit the
representation

$$f(u)=|h(iu)|^2, \qquad (13)$$

where

$$h(iu)=\frac{1}{\sqrt{2\pi}}\int_0^{\infty}b(s)\,e^{-ius}\,ds,\qquad \int_0^{\infty}|b(s)|^2\,ds<\infty. \qquad (14)$$
Proof. Necessity. Let the process $\eta(t)$ admit representation (11). Set

$$h(iu)=\frac{1}{\sqrt{2\pi}}\int_0^{\infty}a(s)\,e^{-ius}\,ds.$$

In view of Parseval's equality

$$R(t)=\int_0^{\infty}a(t+s)\,\overline{a(s)}\,ds=\int_{-\infty}^{\infty}e^{iut}\,|h(iu)|^2\,du,$$

i.e. the spectrum of the process is absolutely continuous and the spectral
density is of the form (13), (14).

Sufficiency. Let the conditions of the lemma be satisfied. Consider the
spectral representation of the process

$$\eta(t)=\int_{-\infty}^{\infty}e^{iut}\,\zeta(du)$$

and the stochastic measure

$$\mu(A)=\int_A\frac{1}{h(iu)}\,\zeta(du). \qquad (15)$$

The stochastic integral (15) is meaningful for an arbitrary bounded
Borel set $A$ since $\chi_A(u)/h(iu)\in\mathcal{L}_2\{F\}$, where $F$ is the spectral measure of the
process, $F(A)=\int_A|h(iu)|^2\,du$.

It is easy to verify that $\mu(A)$ is an orthogonal measure and moreover

$$\mathsf{E}\,\mu(A)\overline{\mu(B)}=\int_{A\cap B}du.$$



Set 









- lU 



lii{du). 



(16) 



Obviously the stochastic integral (16) exists. The random function of the 
interval $\xi(\Delta)=\xi(t_2)-\xi(t_1)$, $\Delta=[t_1,t_2)$, is an elementary measure which

corresponds to the standard process. Indeed, E^{A) = 0. Next, utilizing 
Parseval’s equality for Fourier integrals, we obtain 






00 

=± f 

27T J 



y - iut2 _ ^ ~ ^iut 4 _ ^iut3 



du = 



-lU 



lU 




X^^(t)dt = l{AinA2), 



where A^=[^ti, 12 ), d 2 = [^ 3 ? I is the Lebesgue measure on the 

real line. In view of Lemma 1 in Section 4 and formula (15) we obtain 



00 



rj{t)= (iu) ji {du ) . 



Now we note that if 
h{i 



1 ^) = — !_ I a (s) ds, where \a (s)| ^ ds<co, 

^ j J 



(17) 



(18) 



then 



CO 

u)= J a( — *5 



h (iu) n (du) = a( — s)^ (ds) 



Indeed, since the spaces ^ 2 ( 1 ^) ^ 2 {Q isomorphic to the space 

J ^2 {O’ where / is the Lebesgue measure on the whole real line (— 00 , 00 ) 
and the Fourier transform leaves the scalar product invariant in ^2 (O’ 
it is sufficient to verify formula (18) for simple functions. Let a(t) = 
= Z where is the interval (or semi-interval) (a^, h^). Then 

00 00 

'* I r _ giwflk 

a(-s)^(ds) = —^ ^ fi{du), 

J J “ 

— 00 ^ — 00 



which is a particular case of (18). Thus formula (18) is established. It 
follows from (18) that 



00 



— 00 



h(iu) jj,(du) = 




— 00 



(19) 
since, in view of formula (16), the multiplication of the measure $\mu$ by $e^{iut}$
yields the shift in the argument of the function $\xi$ by the amount $t$. We obtain
from (17) and (19) that



)-ja 



rj{t)=\ a{s)d^{t — s), where 



a(t)=-j=z j 



h{iu) e du. 



The lemma is thus proved, □ 

Let the spectral density $f(u)$ of a process $\eta(t)$ be given. The following
questions arise. When does the spectral density admit representation
(13), (14) (or, as it is commonly called, factorization)? How can we find
the function $h(iu)$, given the function $f(u)$ (or analogously, how can one
find the function $a(t)$)? Answers to these problems can be obtained by
reducing them to the case of factorization of functions on a circle, a
problem which has already been solved. We introduce the transformation
$w=\dfrac{1+z}{1-z}$ which maps the disc $D=\{z: |z|<1\}$ onto the right-hand
half-plane $\Pi^+=\{w: \operatorname{Re}w>0\}$. On the boundary of the corresponding regions
($w=iu$, $z=e^{i\theta}$) this transformation is of the form $u=\operatorname{ctg}\,\theta/2$. Let $f(u)$
admit factorization (13), (14). Set

$$\tilde f(\theta)=f(u)(1+u^2). \qquad (20)$$

The function $\tilde f(\cdot)$ admits the factorization $\tilde f(\theta)=|g(e^{i\theta})|^2$, where $g(z)$ is
analytic in $D$, and $\tilde f$ is integrable on $(-\pi,\pi)$:

$$\int_{-\pi}^{\pi}\tilde f(\theta)\,d\theta=2\int_{-\infty}^{\infty}f(u)\,du<\infty,$$

i.e. $g(z)\in H_2$. In view of Theorem 1

$$-\infty<\int_{-\pi}^{\pi}\ln\tilde f(\theta)\,d\theta=2\int_{-\infty}^{\infty}\frac{\ln f(u)+\ln(1+u^2)}{1+u^2}\,du,$$

hence

$$\int_{-\infty}^{\infty}\frac{\ln f(u)}{1+u^2}\,du>-\infty. \qquad (21)$$
Assume the converse. Let $f(u)$ be non-negative, integrable and satisfy
relation (21). Define $\tilde f(\cdot)$ by means of (20). Then $\tilde f(\cdot)$ is integrable and

$$\int_{-\pi}^{\pi}\ln\tilde f(\theta)\,d\theta>-\infty.$$



It follows from Theorem 1 that /(•) admits factorization 



0 n = 0 

Set 

= - Z 7T— • 

l+W„ = o \1 + W/ 

The function /i(-) is analytic in the right-hand half-plane and/(w) = 

= \h(iu)\^, 

(1 — inf 

n = 0 '(1+iw)' 

Recalling that functions form a complete orthonormal sequence 

^/in 



h(iu)= Z _ 



( 22 ) 



1 {l-iuf 

,\n+ 1 



IS a 



in ^2(~7 t, n) it is easy to verify that the sequence , . 

complete orthonormal sequence in <^2 (—^ 5 ^) with respect to the 
Lebesgue measure. Therefore the series for h{iu) written above (eq. (22)) 
is m.s. convergent. We now note that 



A, A, 



V ^ — V 



(1 + m)"'^^ k=i(l+iu)^ k=i(k—l)\ 






= f e~‘'“ B„{t)dt, 



0 

so that 



h(iu) 



e '“^fo(t)dt, where 



0 



00 



n = 0 



One should keep in mind that the partial sums of the series (22) are
Fourier transforms (up to a multiplicative factor) of functions equal to
$\sum_{n=0}^{N}a_n B_n(t)$ for $t\ge 0$ and equal to zero for $t<0$. Since the Fourier transform
does not change the norm of functions in $\mathcal{L}_2$, the m.s. convergence of the
series for $b(t)$ and the fact that $\int_0^{\infty}|b(t)|^2\,dt<\infty$ follow
from the m.s. convergence of series (22). As far as the uniqueness of
the factorization obtained of the function $f(u)$ is concerned, the situation
is analogous to the case of factorization of functions on a circle. The
expression for $h(w)$ will be obtained from formula (9) by writing $w$ in
place of $z$ and performing the corresponding change of variables under
the integral sign:

$$h(w)=\exp\left\{\frac{1}{2\pi}\int_{-\infty}^{\infty}\ln f(u)\,\frac{i+uw}{u+iw}\,\frac{du}{1+u^2}\right\}. \qquad (23)$$



Theorem 3. In order that a non-negative integrable function $f(u)$
($-\infty<u<\infty$) admit factorization (13), (14) it is necessary and sufficient
that

$$\int_{-\infty}^{\infty}\frac{\ln f(u)}{1+u^2}\,du>-\infty. \qquad (24)$$

Under the additional assumptions that $h(w)\ne 0$ ($\operatorname{Re}w>0$), $h(1)>0$, the
function $h(w)$ is unique and is defined by formula (23).

Theorem 4. In order that a stationary process $\eta(t)$ ($-\infty<t<\infty$) admit
representation (11) it is necessary and sufficient that it possess an absolutely
continuous spectrum and that its spectral density satisfy condition (24).



§8. Forecasting and Filtering Stationary Processes 

One of the important problems in the theory of random processes which
has numerous practical applications is as follows: it is required to
estimate in the best possible manner the value of a random variable $\zeta$ by
observing a certain set of random variables $\{\xi_\alpha, \alpha\in A\}$. Thus it is required
to find a function $f(\xi_\alpha \mid \alpha\in A)$ of the variables $\xi_\alpha$, $\alpha\in A$, which satisfies the
approximate equality

$$\zeta\approx\hat\zeta=f(\xi_\alpha \mid \alpha\in A) \qquad (1)$$

with the least possible error.

An example of such a problem is the forecasting (or extrapolation)
of a random process. Here it is required to predict the value of a random
process at time $t^*$ by means of its values on a set of instants of time
preceding $t^*$.

Another example is the problem of filtering a random process. The
problem is as follows: at times $t'\in T'\subset T$ the process $\xi(t)=\eta(t)+\zeta(t)$,
consisting of the "useful" signal $\zeta(t)$ and the additive "noise" $\eta(t)$, was
observed. It is required to separate the noise from the signal, i.e. it is
required to obtain for some $t^*\in T$ the best approximation to $\zeta(t^*)$ of the
form

$$\zeta(t^*)\approx f(\xi(t') \mid t'\in T').$$

The problem stated is not completely defined since the meaning of “the 
best approximation” is not clear. Evidently the optimality criterion de- 
pends on the practical nature of the problem under consideration. As 
far as the mathematical theory is concerned the most advanced solutions 
of the problem are based on choosing the mean square deviation as the 
measure of precision of the approximate equality (1). 

The quantity

$$\delta=\{\mathsf{E}[\zeta-f(\xi_\alpha \mid \alpha\in A)]^2\}^{1/2} \qquad (2)$$

is called the mean square error of the approximate formula (1). The
problem is to determine a function $f$ such that (2) attains its minimal
value. In the case when $A$ is a finite set, $f(\xi_\alpha \mid \alpha\in A)$ represents a Borel
measurable function of the arguments $\xi_\alpha$, $\alpha\in A$. If, however, $A$ is infinite, then
this symbol represents a random variable measurable with respect to the
$\sigma$-algebra $\mathfrak{F}=\sigma\{\xi_\alpha, \alpha\in A\}$ generated by the set of random variables $\{\xi_\alpha,$
$\alpha\in A\}$.

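Henceforth it will be assumed that both $\zeta$ and $f(\xi_\alpha \mid \alpha\in A)$ possess
moments of the second order.

Set

$$\gamma=\mathsf{E}(\zeta \mid \mathfrak{F}). \qquad (3)$$

Then

$$\delta^2=\mathsf{E}(\zeta-f(\xi_\alpha \mid \alpha\in A))^2=\mathsf{E}(\zeta-\gamma)^2+2\mathsf{E}(\zeta-\gamma)(\gamma-f(\xi_\alpha \mid \alpha\in A))+\mathsf{E}(\gamma-f(\xi_\alpha \mid \alpha\in A))^2.$$

Since $\gamma-f(\xi_\alpha \mid \alpha\in A)$ is $\mathfrak{F}$-measurable,

$$\mathsf{E}(\zeta-\gamma)(\gamma-f(\xi_\alpha \mid \alpha\in A))=\mathsf{E}\,\mathsf{E}\{(\zeta-\gamma)(\gamma-f(\xi_\alpha \mid \alpha\in A)) \mid \mathfrak{F}\}=\mathsf{E}(\gamma-f(\xi_\alpha \mid \alpha\in A))\,\mathsf{E}\{(\zeta-\gamma) \mid \mathfrak{F}\}=0.$$

Hence,

$$\delta^2=\mathsf{E}(\zeta-\gamma)^2+\mathsf{E}(\gamma-f(\xi_\alpha \mid \alpha\in A))^2.$$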

We thus obtain 

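Theorem 1. An approximation to a random variable $\zeta$ possessing finite
second order moments, with minimal mean square error, by means of an
$\mathfrak{F}=\sigma\{\xi_\alpha, \alpha\in A\}$-measurable random variable is unique (mod P) and is
given by the formula

$$\gamma=\mathsf{E}\{\zeta \mid \mathfrak{F}\}.$$

Remark. The estimator $\hat\zeta=\gamma$ of the random variable $\zeta$ is unbiased, i.e.

$$\mathsf{E}\gamma=\mathsf{E}\,\mathsf{E}\{\zeta \mid \mathfrak{F}\}=\mathsf{E}\zeta,$$

and the variables $\zeta-\gamma$ and $\xi_\alpha$ are uncorrelated for any $\alpha\in A$:

$$\mathsf{E}(\zeta-\gamma)\xi_\alpha=\mathsf{E}\,\mathsf{E}\{(\zeta-\gamma)\xi_\alpha \mid \mathfrak{F}\}=\mathsf{E}\,\xi_\alpha\,\mathsf{E}\{(\zeta-\gamma) \mid \mathfrak{F}\}=0.$$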

Unfortunately practical application of Theorem 1 is often very difficult.
In the case of Gaussian random variables one may proceed, however,
one step further. First we note that a simpler problem, leading in many
cases to complete and analytically accessible solutions, is that of obtaining
an optimal approximation not in the class of all measurable functions
of given random variables but in the narrower class of linear functions.
More precisely this means the following. Let $\{\Omega, \mathfrak{S}, P\}$ be the basic
probability space. Assume that the variables $\xi_\alpha$ and $\zeta$ possess finite
moments of the second order. We introduce the subspace $\mathcal{L}_2\{\xi_\alpha, \alpha\in A\}$ of
the Hilbert space $\mathcal{L}_2\{\Omega, \mathfrak{S}, P\}$ which is the closed linear span of the
variables $\xi_\alpha$, $\alpha\in A$, and a constant. One may regard the subspace
$\mathcal{L}_2\{\xi_\alpha, \alpha\in A\}$ as the set of all linear (nonhomogeneous) functions of $\xi_\alpha$ with
finite variances. The best linear approximation $\hat\zeta$ of the random variable $\zeta$ is the
element of $\mathcal{L}_2\{\xi_\alpha, \alpha\in A\}$ which is closest to $\zeta$, i.e.

$$\delta^2=\mathsf{E}|\zeta-\hat\zeta|^2\le\mathsf{E}|\zeta'-\zeta|^2$$

for any $\zeta'\in\mathcal{L}_2\{\xi_\alpha, \alpha\in A\}$. We know from the theory of Hilbert spaces
that the problem of determining an element $\hat\zeta$ belonging to a subspace $H_0$
which is closest to the given element $\zeta$ always has a unique solution.
Namely, $\hat\zeta$ is the projection of $\zeta$ onto $H_0$. The element $\hat\zeta$ can always be
determined, and moreover uniquely determined, from the system of
equations $(\zeta-\hat\zeta, \zeta'')=0$ for any $\zeta''\in H_0$. In our case this system of
equations reduces to the equation

$$\mathsf{E}(\hat\zeta\,\overline{\xi_\alpha})=\mathsf{E}(\zeta\,\overline{\xi_\alpha}),\qquad \alpha\in A, \qquad (4)$$

and, since the variable $\zeta''=1$ belongs to $\mathcal{L}_2\{\xi_\alpha, \alpha\in A\}$,

$$\mathsf{E}\hat\zeta=\mathsf{E}\zeta,$$

so that the optimal linear estimators $\hat\zeta$ are necessarily unbiased. We may
assume that $\mathsf{E}\xi_\alpha=0$ for any $\alpha$. Therefore in what follows we shall confine
ourselves to the subspace of the random variables in $\mathcal{L}_2\{\Omega, \mathfrak{S}, P\}$
with zero mathematical expectations.

Obviously, linear estimators of $\zeta$ may not always be suitable. For
example, if $\xi(n)=e^{in\nu}$, where $\nu$ is uniformly distributed on $(-\pi,\pi)$,
then $\mathsf{E}\{\xi(n)\overline{\xi(m)}\}=0$ (for $n\ne m$) and the best linear estimator of the
variable $\xi(m)$ in terms of the values of all $\xi(n)$ ($n\ne m$) is of the form
$\hat\xi(m)=0$, i.e. it does not utilize the values of the variables $\xi(n)$, while
an arbitrary pair of observations $\xi(k)$ and $\xi(k+1)$ is sufficient to determine
exactly the whole sequence $\xi(n)$, namely

$$\xi(n)=\xi(k)\left[\frac{\xi(k+1)}{\xi(k)}\right]^{n-k}.$$

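We now assume that the finite dimensional distributions of the
system $\{\zeta, \xi_\alpha, \alpha\in A\}$ are normal and $\mathsf{E}\xi_\alpha=0$, $\mathsf{E}\zeta=0$. In this case the fact
that the variables $\zeta-\hat\zeta$ and $\xi_\alpha$ are uncorrelated implies that they are
independent. Therefore $\zeta-\hat\zeta$ does not depend on the $\sigma$-algebra $\mathfrak{F}$ and

$$\mathsf{E}\{\zeta \mid \mathfrak{F}\}=\mathsf{E}\{\zeta-\hat\zeta+\hat\zeta \mid \mathfrak{F}\}=\mathsf{E}(\zeta-\hat\zeta)+\hat\zeta=\hat\zeta.$$

Theorem 2. Given a system of Gaussian random variables $\{\zeta, \xi_\alpha, \alpha\in A\}$,
the best estimator (in the mean-square sense) of the variable $\zeta$ by means of a
$\sigma\{\xi_\alpha, \alpha\in A\}$-measurable function coincides with the best linear estimator
in $\mathcal{L}_2\{\xi_\alpha, \alpha\in A\}$.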

Below a number of specific problems of construction of optimal 
linear estimators are examined. 

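A) The number of random variables $\xi_\alpha$ is finite ($\alpha=1, 2, \dots, n$). The
solution of this problem is simple and well known from linear algebra.
Assuming that $\xi_\alpha$ are linearly independent, one may represent the
projection $\hat\zeta$ of the variable $\zeta$ onto the finite dimensional space $H_0$ spanned
by the variables $\xi_\alpha$ ($\alpha=1, \dots, n$) by means of the formula

$$\hat\zeta=-\frac{1}{\Gamma}\begin{vmatrix}(\xi_1,\xi_1) & \dots & (\xi_1,\xi_n) & \xi_1\\ \dots & \dots & \dots & \dots\\ (\xi_n,\xi_1) & \dots & (\xi_n,\xi_n) & \xi_n\\ (\zeta,\xi_1) & \dots & (\zeta,\xi_n) & 0\end{vmatrix},$$

where $\Gamma=\Gamma(\xi_1, \xi_2, \dots, \xi_n)$ is the determinant of the Gram matrix
constructed from the vectors $\xi_1, \dots, \xi_n$, and $(\xi,\eta)=\mathsf{E}(\xi\bar\eta)$. The mean square
error $\delta$ of the approximate equality $\zeta\approx\hat\zeta$ is equal to the length of the
perpendicular dropped from the end of the vector $\zeta$ onto the space $H_0$ and is
given by the formula

$$\delta^2=\frac{\Gamma(\xi_1,\dots,\xi_n,\zeta)}{\Gamma(\xi_1,\dots,\xi_n)}.$$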
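The finite case A) amounts to solving the normal equations (4) for the coefficients of the projection. The following is a minimal sketch for real-valued variables only; the Gram matrix, the cross moments and the value of $\mathsf{E}\zeta^2$ are made-up numbers, and the helper names are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Gauss-Jordan elimination with partial pivoting for a small real system."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("Gram matrix is singular: the variables are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def best_linear_estimator(gram, cross, zeta_second_moment):
    """Coefficients c with sum_i c_i (xi_i, xi_k) = (zeta, xi_k), and the squared error."""
    coeffs = solve_linear(gram, cross)
    error_sq = zeta_second_moment - sum(c * k for c, k in zip(coeffs, cross))
    return coeffs, error_sq

gram = [[2.0, 1.0], [1.0, 2.0]]   # hypothetical moments E(xi_i * xi_k)
cross = [1.0, 0.0]                # hypothetical moments E(zeta * xi_k)
coeffs, error_sq = best_linear_estimator(gram, cross, 1.0)
print(coeffs, error_sq)  # error_sq equals the ratio of Gram determinants, here 1/3
```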

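B) Consider the problem of estimating the random variable $\zeta$ by
means of the results of observations on an m.s. continuous random
process $\xi(t)$ during a finite time-interval $T=[a,b]$. Let $R(t,s)$ be the
correlation function of the process $\xi(t)$. In view of Theorem 5 in Section 3
the process $\xi(t)$ is expandable into the series

$$\xi(t)=\sum_{k=1}^{\infty}\sqrt{\lambda_k}\,\varphi_k(t)\,\xi_k,$$

where $\varphi_k(t)$ is an orthonormal sequence of eigenfunctions and $\lambda_k$ are the
eigenvalues of the correlation function on $(a,b)$:

$$\lambda_k\varphi_k(t)=\int_a^b R(t,s)\,\varphi_k(s)\,ds,$$

and $\xi_k$ is a normalized uncorrelated sequence,

$$\mathsf{E}\,\xi_k\overline{\xi_l}=\delta_{kl}.$$

Clearly $\xi_k$, $k=1, 2, \dots$, form a basis in $\mathcal{L}_2\{\xi(t), t\in(a,b)\}$. Therefore

$$\hat\zeta=\sum_{n=1}^{\infty}c_n\,\xi_n,$$

where

$$\xi_n=\frac{1}{\sqrt{\lambda_n}}\int_a^b\xi(t)\,\overline{\varphi_n(t)}\,dt,\quad n=1, 2, \dots,$$

and

$$c_n=\mathsf{E}\,\zeta\overline{\xi_n}=\frac{1}{\sqrt{\lambda_n}}\int_a^b R_{\zeta\xi}(t)\,\varphi_n(t)\,dt,\qquad R_{\zeta\xi}(t)=\mathsf{E}\,\zeta\overline{\xi(t)}.$$

The mean square error $\delta$ of the estimator is calculated from the formula

$$\delta^2=\mathsf{E}|\zeta|^2-\mathsf{E}|\hat\zeta|^2=\mathsf{E}|\zeta|^2-\sum_{n=1}^{\infty}\frac{1}{\lambda_n}\left|\int_a^b R_{\zeta\xi}(t)\,\varphi_n(t)\,dt\right|^2.$$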


The practical applicability of this method is hindered by the complexity 
in computing eigenfunctions and eigenvalues of the kernel R{t, s). 
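When closed-form eigenfunctions are unavailable, the eigenvalue problem for the kernel can be approximated numerically. The sketch below is an illustration under stated assumptions, not a method from the text: it discretizes the integral operator with a midpoint rule (the helper name `kl_eigenpairs` is hypothetical) and is tried on the kernel $R(t,s) = \min(t,s)$ on $[0,1]$, whose eigenvalues are known to be $1/((k-\tfrac12)^2\pi^2)$.

```python
import numpy as np

def kl_eigenpairs(kernel, a, b, n):
    """Approximate eigenvalues/eigenfunctions of a symmetric kernel on (a, b).

    Midpoint-rule discretization: the integral operator becomes the matrix
    h * K, where K[i, j] = kernel(t_i, t_j) and h is the grid step.
    """
    h = (b - a) / n
    t = a + h * (np.arange(n) + 0.5)           # midpoints of the subintervals
    K = kernel(t[:, None], t[None, :])
    lam, vec = np.linalg.eigh(K * h)           # ascending eigenvalues
    order = np.argsort(lam)[::-1]              # largest first
    phi = vec[:, order] / np.sqrt(h)           # so that sum(phi**2) * h == 1
    return t, lam[order], phi

# Assumed test kernel: correlation function min(t, s) on [0, 1].
t, lam, phi = kl_eigenpairs(np.minimum, 0.0, 1.0, 400)
exact = 1.0 / ((np.arange(1, 4) - 0.5) ** 2 * np.pi ** 2)
print(lam[:3], exact)
```

The cost grows with the cube of the grid size, which is one concrete form of the computational difficulty mentioned above.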

Wiener's method. Let $\xi(t)$ and $\zeta(t)$ ($t \in T$) be two Hilbert random functions.
Assume that the process $\xi(t)$ is observed on a certain set $T^*$ of values
of $t$. The problem is to determine the optimal estimate of the value of
$\zeta(t_0)$, $t_0 \in T$, in terms of the observed values of $\xi(t)$, $t \in T^*$. If we assume
that the required estimate is of the form

$$\tilde\zeta(t_0) = \int_{T^*} c(s)\,\xi(s)\,m(ds), \tag{5}$$

where $m$ is a measure on $T^*$ and the conditions which assure that the integral
is meaningful are satisfied, then equation (4) becomes

$$\int_{T^*} c(s)\,R_{\xi\xi}(s, t)\,m(ds) = R_{\zeta\xi}(t_0, t), \quad t \in T^*, \tag{6}$$

where $R_{\xi\xi}$ is the correlation function of $\xi(t)$ and $R_{\zeta\xi}$ is the cross-correlation
function of $\zeta(t)$ and $\xi(t)$. Equation (6) is a Fredholm integral
equation of the first kind with a symmetric (Hermitian) kernel. A solution
of this equation does not always exist. However, if

$$\int_{T^*} E|\xi(t)|^2\,m(dt) < \infty,$$

then the integral equation (6) possesses a solution $c(s) \in \mathscr{L}_2\{m\}$ if and
only if the optimal linear estimate $\tilde\zeta(t_0)$ of the value $\zeta(t_0)$ is of the form (5).
Let $T$ be the real line, $T^* = (a, b)$, let the processes $\xi(t)$ and $\zeta(t)$
be stationary and stationary correlated (in the wide sense) and let $m$
be the Lebesgue measure. Then equation (6) becomes

$$\int_a^b c(s)\,R_{\xi\xi}(s - t)\,ds = R_{\zeta\xi}(t_0 - t), \quad t \in (a, b). \tag{7}$$

If, however, $\zeta(t) = \xi(t)$ ($-\infty < t < \infty$) and $t_0 > b$, i.e. if the problem is
to estimate the value of $\xi(t_0)$ in terms of the values of $\xi(t)$ in the past,
the problem then is one of pure forecasting.

We shall discuss in some detail the problem of forecasting the values
of $\zeta(t + q)$ knowing the values of the process $\xi(s)$ up to the time $t$,
$s \le t$. It will be assumed here that the processes $\xi(t)$ and $\zeta(t)$ are stationary
and stationary correlated (in the wide sense). The forecasting variable
$\tilde\zeta(t)$ will be regarded as a function of $t$ for a fixed $q$. It is easy to observe
that the variable $\tilde\zeta(t)$ defined by equation (7) is a stationary process.
Indeed, equation (7) is of the form

$$\int_{-\infty}^{t} c_t(s)\,R_{\xi\xi}(s - u)\,ds = R_{\zeta\xi}(t + q - u), \quad u \le t.$$

Transforming the variables $t - u = v$ and $t - s = \tau$ we get the following
form of (7) in terms of $v$ and $\tau$:

$$\int_0^{\infty} c_t(t - \tau)\,R_{\xi\xi}(v - \tau)\,d\tau = R_{\zeta\xi}(q + v), \quad v \ge 0. \tag{8}$$

We thus observe that the function $c_t(t - \tau)$ is independent of $t$. Set
$c(\tau) = c_t(t - \tau)$. Equation (8) becomes

$$\int_0^{\infty} c(s)\,R_{\xi\xi}(t - s)\,ds = R_{\zeta\xi}(q + t), \quad t \ge 0, \tag{9}$$

and formula (5) for the forecasting function is of the form

$$\tilde\zeta(t) = \int_{-\infty}^{t} c(t - s)\,\xi(s)\,ds = \int_0^{\infty} c(s)\,\xi(t - s)\,ds. \tag{10}$$

Thus the process $\tilde\zeta(t) = \tilde\zeta_q(t)$ is stationary. It follows from (10) that $c(t)$ is
the impulse transition function of a physically realizable filter which
transforms the observed process into an optimal estimator of the variable
$\zeta(t + q)$.

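It is easy to derive the expression for the mean square error $\delta$ of the
forecasting function $\tilde\zeta(t)$. Since $\delta^2$ is the square of the length of the
perpendicular dropped from the end of the vector $\zeta(t + q)$ onto the closed
linear span of $\{\xi(s),\ s \le t\}$,

$$\delta^2 = E|\zeta(t + q)|^2 - E|\tilde\zeta(t)|^2 = R_{\zeta\zeta}(0) - \int_0^{\infty}\!\!\int_0^{\infty} \overline{c(t)}\,R_{\xi\xi}(t - s)\,c(s)\,ds\,dt. \tag{11}$$

Setting $R_{\zeta\zeta}(0) = \sigma_\zeta^2$ and using the spectral representation of the function
$R_{\xi\xi}$ we obtain

$$\delta^2 = \sigma_\zeta^2 - \int_{-\infty}^{\infty} |c(iu)|^2\,dF_{\xi\xi}(u), \tag{12}$$

where $F_{\xi\xi}(u)$ is the spectral function of the process $\xi(t)$ and

$$c(iu) = \int_0^{\infty} c(t)\,e^{-iut}\,dt.$$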



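We shall briefly describe the method of solving equation (9) proposed by
N. Wiener. Assume that the spectrum of the process $\xi(t)$ is absolutely
continuous and the spectral density $f_{\xi\xi}(u)$ admits factorization (cf.
Theorem 3, Section 7)

$$f_{\xi\xi}(u) = |h(iu)|^2, \qquad h(z) = \frac{1}{\sqrt{2\pi}} \int_0^{\infty} a(t)\,e^{-zt}\,dt, \quad \operatorname{Re} z \ge 0.$$

It follows from Parseval's equality for the Fourier transform that

$$R_{\xi\xi}(t) = \int_{-\infty}^{\infty} e^{itu}\,|h(iu)|^2\,du = \int_0^{\infty} a(t + s)\,\overline{a(s)}\,ds.$$

Assume also that the cross spectral function of the processes $\zeta(t)$ and $\xi(t)$ is
absolutely continuous and its density $f_{\zeta\xi}(u)$ satisfies the condition

$$\frac{f_{\zeta\xi}(u)}{\overline{h(iu)}} = k(iu) \in \mathscr{L}_2. \tag{13}$$

Then

$$R_{\zeta\xi}(t) = \int_{-\infty}^{\infty} e^{itu} f_{\zeta\xi}(u)\,du = \int_{-\infty}^{\infty} e^{itu}\,k(iu)\,\overline{h(iu)}\,du = \int_0^{\infty} b(t + s)\,\overline{a(s)}\,ds,$$

where

$$b(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} k(iu)\,e^{iut}\,du.$$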
Using the above expressions, equation (9) can be rewritten as

$$\int_0^{\infty} \Bigl[\, b(q + t + s) - \int_0^{\infty} c(\tau)\,a(t - \tau + s)\,d\tau \Bigr]\,\overline{a(s)}\,ds = 0, \quad t > 0. \tag{14}$$

In order that (14) hold it is sufficient that the function $c(t)$ satisfy the equation

$$b(q + x) = \int_0^{\infty} c(\tau)\,a(x - \tau)\,d\tau, \quad x > 0. \tag{15}$$

Equation (15) is of the same type as equation (9), the only difference
being that the function $a(t)$ vanishes for negative values of $t$. Writing (15)
in the form

$$b(q + x) = \int_0^{x} c(\tau)\,a(x - \tau)\,d\tau, \quad x > 0, \tag{16}$$

we can solve this equation directly using the Laplace transform. Multiplying
equality (16) by $e^{-zx}$ and integrating from $0$ to $\infty$, we obtain

$$B_q(z) = C(z)\,h(z),$$

where

$$B_q(z) = \frac{1}{\sqrt{2\pi}} \int_0^{\infty} b(q + x)\,e^{-zx}\,dx, \qquad C(z) = \int_0^{\infty} c(t)\,e^{-zt}\,dt.$$

Consequently

$$C(z) = \frac{B_q(z)}{h(z)}, \tag{17}$$

where the expression for $B_q(z)$, $\operatorname{Re} z > 0$, can be presented as

$$B_q(z) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{iuq}\,f_{\zeta\xi}(u)}{\overline{h(iu)}\,(z - iu)}\,du. \tag{18}$$

The statement of the assumptions required for the validity of formulas (17)
and (18) is very cumbersome. It is simpler, in each particular case, to
check directly the admissibility of the transformations leading to the
solution of the problem at hand.

Yaglom’s method. In contrast to Wiener’s method, Yaglom’s procedure 
determines the frequency characteristics of an optimal filter rather than 
the impulse transition function which may not even exist. The general 
formulas for the solution of the problem are not given, only the method 
of choosing the function sought, based on the conditions it should satisfy, 
is presented. In many important cases this choice can be easily made. 

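Let a two-dimensional stationary process $(\xi(t), \zeta(t))$ admit the spectral
representation

$$\xi(t) = \int_{-\infty}^{\infty} e^{iut}\,\nu_1(du), \qquad \zeta(t) = \int_{-\infty}^{\infty} e^{iut}\,\nu_2(du)$$

with the matrix of spectral densities given by

$$\begin{pmatrix} f_{\xi\xi}(u) & f_{\xi\zeta}(u) \\ f_{\zeta\xi}(u) & f_{\zeta\zeta}(u) \end{pmatrix}.$$

As before we shall consider the problem of an optimal estimate of the
variable $\zeta(t + q)$ given the values of the process $\xi(s)$ for $s \le t$. The
forecasting process $\tilde\zeta(t)$ is subordinated to $\xi(t)$. Therefore

$$\tilde\zeta(t) = \int_{-\infty}^{\infty} e^{iut}\,c(iu)\,\nu_1(du),$$

where

$$\int_{-\infty}^{\infty} |c(iu)|^2 f_{\xi\xi}(u)\,du < \infty. \tag{19}$$

The equation

$$E\,\zeta(t + q)\,\overline{\xi(s)} = E\,\tilde\zeta(t)\,\overline{\xi(s)}, \quad s \le t,$$

which determines the process $\tilde\zeta(t)$ becomes

$$\int_{-\infty}^{\infty} e^{ius} \bigl[\, e^{iuq} f_{\zeta\xi}(u) - c(iu)\,f_{\xi\xi}(u) \bigr]\,du = 0, \quad s > 0. \tag{20}$$

In addition to conditions (19) and (20) we also have the requirement that
$c(iu)$ be the frequency characteristic of a physically realizable filter. These
conditions will be satisfied if: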







a) the function $f_{\xi\xi}(u)$ is bounded;

b) $c(iu)$ is the limiting value of a function $c(z) \in H_2^+$;

c) $\psi(iu) = e^{iuq} f_{\zeta\xi}(u) - c(iu)\,f_{\xi\xi}(u)$ is the limiting value of a function
$\psi(z)$ in $H_2^-$.

Here $H_2^+$ ($H_2^-$) denotes the space of functions $h(z)$ analytic in the
right-hand (left-hand) half-plane for which the integral

$$\int_{-\infty}^{\infty} |h(x + iu)|^2\,du$$

is uniformly bounded for $x > 0$ ($x < 0$).

Indeed it follows from b) that $\int_{-\infty}^{\infty} |c(iu)|^2\,du < \infty$ and this in conjunction
with a) assures that condition (19) will be satisfied. Moreover, it
follows from b) that $c(iu)$ is the frequency characteristic of a physically
realizable filter. It follows from condition c) that $e^{iuq} f_{\zeta\xi}(u) - c(iu)\,f_{\xi\xi}(u)$
is the Fourier transform of a function which vanishes for the positive
values of the argument. Thus, the relation (20) is proved.

Note that condition b) rejects all the filters with frequency characteris- 
tics increasing at infinity. Such frequency characteristics correspond to 
operations connected with differentiation of the process ^(t) and are 
often encountered in the course of construction of optimal filters. There- 
fore it may be desired to substitute condition a) by a less restrictive one. 
Assume that $c(z)$ is a function analytic in the right-hand half-plane and
let $|c(z)| \to \infty$ as $|z| \to \infty$ but not faster than a certain power of $z$ (say the
$r$-th). The function

$$c_n(z) = \frac{c(z)}{\left(1 + \dfrac{z}{n}\right)^{r+1}}$$

then belongs to $H_2^+$. Since $|c_n(iu)| \le |c(iu)|$, we have

$$\lim_{n \to \infty} \int_{-\infty}^{\infty} |c_n(iu) - c(iu)|^2 f_{\xi\xi}(u)\,du = 0,$$

provided condition (19) is satisfied. Thus $c(iu)$ is the limit in $\mathscr{L}_2\{F_{\xi\xi}\}$ of the
frequency characteristics of physically realizable filters and therefore
$c(iu)$ is also a frequency characteristic of such a filter. We have thus
obtained
Theorem 3. If the spectral density $f_{\xi\xi}(u)$ of the process $\xi(t)$ is bounded,
then the conditions

a) $\int_{-\infty}^{\infty} |c(iu)|^2 f_{\xi\xi}(u)\,du < \infty$,

b) $c(iu)$ is the limiting value of a function $c(z)$ analytic in the right-hand
half-plane and increasing as $|z| \to \infty$ not faster than a certain power
of $|z|$,

c) $\psi(iu) = e^{iuq} f_{\zeta\xi}(u) - c(iu)\,f_{\xi\xi}(u)$ is the limiting value of a function $\psi(z)$
belonging to $H_2^-$,

determine the frequency characteristic $c(iu)$ of an optimal filter estimating
the variable $\zeta(t + q)$.

The mean square error $\delta$ of the optimal estimator is equal to

$$\delta = \bigl\{E|\zeta(t + q)|^2 - E|\tilde\zeta(t)|^2\bigr\}^{1/2} = \Bigl\{\sigma_\zeta^2 - \int_{-\infty}^{\infty} |c(iu)|^2 f_{\xi\xi}(u)\,du\Bigr\}^{1/2}. \tag{21}$$



Example 1. Consider the problem of pure forecasting of the process $\xi(t)$
($\zeta(t) = \xi(t)$) with correlation function $R(t) = \sigma^2 e^{-\alpha|t|}$, $\alpha > 0$. The spectral
density is easily found to be

$$f_{\xi\xi}(u) = \frac{\sigma^2 \alpha}{\pi\,(u^2 + \alpha^2)}.$$

The analytic continuation of the function $\psi(iu)$ is of the form

$$\psi(z) = \frac{c(z) - e^{zq}}{(z + \alpha)(z - \alpha)} \cdot \frac{\sigma^2 \alpha}{\pi}.$$

The function $\psi(z)$ possesses a unique pole in the left-hand half-plane, at
$z = -\alpha$. To neutralize this pole by means of a function $c(z)$ analytic in the
right-hand half-plane it is sufficient to take $c(z) = \text{const} = e^{-\alpha q}$. With this
choice of $c(z)$, condition a) of Theorem 3 is satisfied. Thus

$$c(iu) = e^{-\alpha q}, \qquad \tilde\xi(t) = \int_{-\infty}^{\infty} e^{iut}\,e^{-\alpha q}\,\nu_1(du),$$

i.e. the optimal forecast of the variable $\xi(t + q)$ is given by the formula

$$\tilde\xi(t) = e^{-\alpha q}\,\xi(t),$$

which depends only on the value of $\xi(t)$ at the last observed instant of
time. The mean square error of extrapolation is equal to

$$\delta = \sigma \sqrt{1 - e^{-2\alpha q}}.$$
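The conclusion of Example 1 can be illustrated with a finite number of past samples. The sketch below is an illustration, not part of the text: it assumes a real process with correlation function $R(t) = \sigma^2 e^{-\alpha|t|}$ observed at a few hypothetical lags before time $t$, and solves the normal equations for the best linear predictor of $\xi(t+q)$. All the weight should fall on the most recent sample.

```python
import numpy as np

def predictor_weights(alpha, sigma2, q, lags):
    """Best linear predictor of xi(t+q) from xi(t - lag), lag in `lags`.

    Assumes the correlation function R(t) = sigma2 * exp(-alpha * |t|).
    Returns the weights and the squared prediction error.
    """
    lags = np.asarray(lags, dtype=float)
    R = lambda t: sigma2 * np.exp(-alpha * np.abs(t))
    gram = R(lags[:, None] - lags[None, :])    # E xi(t-l_j) xi(t-l_k)
    cross = R(q + lags)                        # E xi(t+q) xi(t-l_j)
    w = np.linalg.solve(gram, cross)
    err2 = sigma2 - cross @ w
    return w, err2

alpha, sigma2, q = 0.7, 2.0, 0.4               # illustrative parameter values
w, err2 = predictor_weights(alpha, sigma2, q, [0.0, 0.5, 1.0, 2.0])
print(w, err2, sigma2 * (1 - np.exp(-2 * alpha * q)))
```

The first weight reproduces $e^{-\alpha q}$ and the remaining weights vanish up to rounding, in agreement with the formula $\tilde\xi(t) = e^{-\alpha q}\xi(t)$.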






Example 2. Consider again the problem of pure forecasting of the process
$\xi(t)$, i.e. the estimate of $\xi(t + q)$ by means of the observed values of
$\xi(s)$, $s \le t$. If the spectrum of the process $\xi(t)$ is absolutely continuous and
condition (24) of Section 7 is satisfied, then the spectral density of the
process admits factorization $f_{\xi\xi}(u) = |h(iu)|^2$, where $h(z) \in H_2^+$ has
no zeros in the right-hand half-plane.

Consider the important practical case when $h(z) = \dfrac{P(z)}{Q(z)}$, where $P(z)$ is
a polynomial of degree $m$ and $Q(z)$ is a polynomial of degree $n$ ($m < n$).
Assume also that the spectral density $f_{\xi\xi}(u)$ is bounded and does not
vanish. Then the zeros of the polynomials $P(z)$ and $Q(z)$ lie in the left-hand
half-plane. Let

$$P(z) = A \prod_{j=1}^{p} (z - w_j)^{\alpha_j}, \qquad Q(z) = B \prod_{j=1}^{r} (z - z_j)^{\beta_j}, \qquad \sum_{j=1}^{p} \alpha_j = m, \quad \sum_{j=1}^{r} \beta_j = n.$$

Set

$$P_1(z) = (-1)^m \bar A \prod_{j=1}^{p} (z + \bar w_j)^{\alpha_j}, \qquad Q_1(z) = (-1)^n \bar B \prod_{j=1}^{r} (z + \bar z_j)^{\beta_j},$$

so that $P_1(iu) = \overline{P(iu)}$ and $Q_1(iu) = \overline{Q(iu)}$ for real $u$.

The analytic continuation of the function $\psi(iu)$ is of the form:

$$\psi(z) = \bigl(e^{zq} - c(z)\bigr)\,\frac{P(z)\,P_1(z)}{Q(z)\,Q_1(z)}.$$

The function $c(z)$ should be analytic in the right-hand half-plane, and
$\psi(z)$ in the left-hand one. Therefore $c(z)$ should be analytic in the whole
complex plane except possibly for poles at the zeros of the polynomial $P(z)$,
where the order of the pole does not exceed the order of the corresponding
zero of $P(z)$. Therefore

$$c(z) = \frac{M(z)}{P(z)},$$

where $M(z)$ is analytic in the $z$-plane and has no singularities for finite $z$.
Since the growth of $c(z)$ is at most exponential, $M(z)$ is a polynomial. In
view of the integrability of the square of the absolute value of the function

$$c(iu)\,\frac{P(iu)}{Q(iu)} = \frac{M(iu)}{Q(iu)}$$

the degree $m_1$ of the polynomial $M$ cannot exceed $n - 1$: $m_1 \le n - 1$.

On the other hand, the choice of function $c(z)$ as given above guarantees
that conditions a) and b) of Theorem 3 will be satisfied. It remains
to choose the polynomial $M(z)$ in such a manner that the function

$$\psi(z) = \frac{\bigl[e^{zq} P(z) - M(z)\bigr]\,P_1(z)}{Q(z)\,Q_1(z)}$$

or equivalently the function

$$\psi_1(z) = \frac{e^{zq} P(z) - M(z)}{Q(z)}$$

will have no poles in the left-hand half-plane. A necessary and sufficient
condition for this is the fulfillment of the equalities

$$\left.\frac{d^i M(z)}{dz^i}\right|_{z = z_k} = \left.\frac{d^i \bigl(e^{zq} P(z)\bigr)}{dz^i}\right|_{z = z_k}, \quad i = 0, 1, \dots, \beta_k - 1, \quad k = 1, \dots, r. \tag{22}$$

The problem of constructing a polynomial $M(z)$ satisfying condition (22)
is a standard problem in interpolation theory and always has a unique
solution in the class of polynomials of degree $n - 1$. Having found the polynomial
$M(z)$ we obtain at the same time the frequency characteristic of the
optimal forecasting filter

$$c(iu) = \frac{M(iu)}{P(iu)}.$$

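The following method of determining the function $c(z)$ may be used.
Expand the functions $P(z)\,Q^{-1}(z)$ and $M(z)\,Q^{-1}(z)$ into partial fractions.
Let

$$\frac{P(z)}{Q(z)} = \sum_{k=1}^{r} \sum_{j=1}^{\beta_k} \frac{c_{kj}}{(z - z_k)^j}, \qquad \frac{M(z)}{Q(z)} = \sum_{k=1}^{r} \sum_{j=1}^{\beta_k} \frac{\gamma_{kj}}{(z - z_k)^j}.$$

In order that the function $\psi_1(z) = \bigl(e^{zq} P(z) - M(z)\bigr)/Q(z)$ have no poles at the points $z_k$, $k = 1, \dots, r$,
it is necessary and sufficient that

$$\left.\frac{d^j}{dz^j}\Bigl[(z - z_k)^{\beta_k}\,\psi_1(z)\Bigr]\right|_{z = z_k} = 0, \quad j = 0, \dots, \beta_k - 1,$$

and moreover

$$\psi_1(z) = \sum_{k=1}^{r} \sum_{j=1}^{\beta_k} \frac{c_{kj}\,e^{zq} - \gamma_{kj}}{(z - z_k)^j}.$$

Simple calculations show that

$$\gamma_{kj} = e^{z_k q} \Bigl[ c_{kj} + \frac{q}{1!}\,c_{k,j+1} + \frac{q^2}{2!}\,c_{k,j+2} + \dots + \frac{q^{\beta_k - j}}{(\beta_k - j)!}\,c_{k,\beta_k} \Bigr], \quad j = 1, \dots, \beta_k, \quad k = 1, \dots, r.$$

Knowing the coefficients $\gamma_{kj}$ we can write down the expression for $c(iu)$:

$$c(iu) = \frac{1}{h(iu)} \sum_{k=1}^{r} \sum_{j=1}^{\beta_k} \frac{\gamma_{kj}}{(iu - z_k)^j} = \frac{\displaystyle\sum_{k=1}^{r} \sum_{j=1}^{\beta_k} \frac{\gamma_{kj}}{(iu - z_k)^j}}{\displaystyle\sum_{k=1}^{r} \sum_{j=1}^{\beta_k} \frac{c_{kj}}{(iu - z_k)^j}}.$$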

Example 3. Assume that the process $\zeta(s)$ ($s \le t$) is observed, but the results
of the observations are distorted by various interferences (noise) so
that the observed values yield a certain function $\xi(s)$, $s \le t$, which is different
from $\zeta(s)$. Assume that the interference (noise) $\eta(t) = \xi(t) - \zeta(t)$
is a stationary process with mean value 0. It is required to
estimate the value of $\zeta(t + q)$ using the results of observations on
$\xi(s) = \zeta(s) + \eta(s)$, $s \le t$.

Such problems are called filtering or smoothing (here we are required
to filter out the noise $\eta(t)$ or to smooth the process $\xi(t)$, i.e. the
irregular noise is to be removed). Moreover if $q > 0$ we have filtering with
forecasting and if $q < 0$ the problem is filtering with delay.

Assume that the noise $\eta(t)$ and the process $\zeta(t)$ are uncorrelated and
possess spectral densities $f_{\eta\eta}(u)$ and $f_{\zeta\zeta}(u)$. Then

$$R_{\xi\xi}(t) = R_{\eta\eta}(t) + R_{\zeta\zeta}(t), \qquad f_{\xi\xi}(u) = f_{\eta\eta}(u) + f_{\zeta\zeta}(u).$$

Since $R_{\zeta\xi}(t) = R_{\zeta\zeta}(t)$, there exists a cross spectral density of the processes
$\zeta(t)$ and $\xi(t)$, and $f_{\zeta\xi}(u) = f_{\zeta\zeta}(u)$.

Let

$$f_{\zeta\zeta}(u) = \frac{c_1}{u^2 + \alpha^2}, \qquad f_{\eta\eta}(u) = \frac{c_2}{u^2 + \beta^2}.$$

Then

$$f_{\xi\xi}(u) = \frac{c_3\,(u^2 + \gamma^2)}{(u^2 + \alpha^2)(u^2 + \beta^2)}, \qquad c_3 = c_1 + c_2, \qquad \gamma^2 = \frac{c_1 \beta^2 + c_2 \alpha^2}{c_1 + c_2}.$$

For the function $\psi(z)$ the following expression is obtained:

$$\psi(z) = \frac{-c_1\,e^{zq}\,(z^2 - \beta^2) + c_3\,c(z)\,(z^2 - \gamma^2)}{(z^2 - \alpha^2)(z^2 - \beta^2)}.$$

Let $q > 0$. The function $\psi(z)$ should be analytic in the left-hand half-plane
and must belong to $H_2^-$. In order that this be satisfied it is required that
the numerator vanish at the points $z = -\alpha$ and $z = -\beta$. This leads to the equations

$$c(-\beta) = 0, \qquad c(-\alpha) = \frac{c_1\,e^{-\alpha q}\,(\alpha^2 - \beta^2)}{c_3\,(\alpha^2 - \gamma^2)}. \tag{23}$$

Moreover $c(z)$ is analytic in the left-hand half-plane (and also in the
right-hand one in view of condition b)) except at the point $z = -\gamma$, where it
has a simple pole. Therefore

$$c(z) = \frac{\varphi(z)}{z + \gamma},$$

where $\varphi(z)$ is an entire function. It follows from the finiteness of the
integral

$$\int_{-\infty}^{\infty} |c(iu)|^2 f_{\xi\xi}(u)\,du$$

that $\varphi(z)$ is a linear function, $\varphi(z) = Az + B$.
From (23) we obtain

$$c(z) = A\,\frac{z + \beta}{z + \gamma}, \qquad A = \frac{c_1}{c_3}\,\frac{\alpha + \beta}{\gamma + \alpha}\,e^{-\alpha q}.$$

Therefore the formula of optimal smoothing with forecasting has the
form

$$\tilde\zeta(t) = A \int_{-\infty}^{\infty} e^{iut}\,\frac{iu + \beta}{iu + \gamma}\,\nu_1(du).$$

Recalling that $(iu + \gamma)^{-1}$ is the frequency characteristic of a physically
realizable filter with impulse transfer function $e^{-\gamma t}$, we obtain

$$\tilde\zeta(t) = \frac{c_1}{c_3}\,\frac{\alpha + \beta}{\gamma + \alpha}\,e^{-\alpha q} \Bigl\{ \xi(t) + (\beta - \gamma) \int_{-\infty}^{t} e^{-\gamma (t - s)}\,\xi(s)\,ds \Bigr\}. \tag{24}$$
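For $q < 0$ formula (23) is not valid. Formally it is due to the fact that the
function $e^{zq}$ is not bounded in this case in the left-hand half-plane.
The function $\psi(z)$ for $q < 0$ can be determined from the following considerations.
Let $\psi_1(z) = -c_1\,e^{zq}\,(z^2 - \beta^2) + c_3\,c(z)\,(z^2 - \gamma^2)$. Then $c(z)$ should
be analytic in the left-hand half-plane except for the point $z = -\gamma$ and
$\psi_1(-\alpha) = \psi_1(-\beta) = 0$. Since

$$c(z) = \frac{\psi_1(z) + c_1\,e^{zq}\,(z^2 - \beta^2)}{c_3\,(z^2 - \gamma^2)}$$

and $c(z)$ is analytic in the right-hand half-plane, $\psi_1(z)$ is an entire function
and

$$\psi_1(\gamma) = -c_1\,e^{\gamma q}\,(\gamma^2 - \beta^2). \tag{25}$$

Set

$$\psi_1(z) = A(z)\,(z + \alpha)(z + \beta).$$

The function $A(z)$ should be entire. It then follows from condition a) of
Theorem 3 that $A(z) = \text{const} = A$. The value $A$ is determined from equation (25):

$$A = c_1\,e^{\gamma q}\,\frac{\beta - \gamma}{\alpha + \gamma}.$$

Hence

$$c(iu) = -\frac{c_1}{c_3}\,\frac{(iu + \beta)\bigl[(\beta - \gamma)(iu + \alpha)\,e^{\gamma q} + (\alpha + \gamma)(iu - \beta)\,e^{iuq}\bigr]}{(\alpha + \gamma)(u^2 + \gamma^2)}. \tag{26}$$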



The methods of forecasting and filtering analogous to those described 
for continuous parameter processes are applicable for stationary se- 
quences. The general solution of forecasting of stationary sequences is 
presented in the next section. Here we confine ourselves to one example. 



Example 4. Consider a stationary sequence $\xi(t)$ which satisfies the simplest
autoregression equation

$$a_0\,\xi(t) + a_1\,\xi(t - 1) + \dots + a_p\,\xi(t - p) = \eta(t), \tag{27}$$

where $\eta(t)$ is a standard uncorrelated sequence and $\xi(t)$ is subordinated
to $\eta(t)$. Let

$$\eta(t) = \int_{-\pi}^{\pi} e^{itu}\,d\zeta(u)$$

be the spectral representation of the sequence $\eta(t)$, where $\zeta(u)$ is a process
with uncorrelated increments and structure function $\frac{1}{2\pi}\,l(A \cap B)$, where $l$
is the Lebesgue measure. The spectral representation of the sequence $\xi(t)$
should be of the form

$$\xi(t) = \int_{-\pi}^{\pi} e^{itu}\,\varphi(u)\,d\zeta(u), \quad \text{where} \quad \int_{-\pi}^{\pi} |\varphi(u)|^2\,du < \infty. \tag{28}$$

Substituting (28) in (27), we obtain

$$\int_{-\pi}^{\pi} e^{itu}\,\varphi(u)\,P(e^{-iu})\,d\zeta(u) = \int_{-\pi}^{\pi} e^{itu}\,d\zeta(u),$$

where $P(z) = \sum_{k=0}^{p} a_k z^k$. Hence

$$\varphi(u) = \frac{1}{P(e^{-iu})} \pmod{l}.$$

Assume that $P(z)$ has no zeros in the closed circle $|z| \le 1$. Then $\dfrac{1}{P(z)} \in H_2$.
If

$$\frac{1}{P(z)} = \sum_{k=0}^{\infty} b_k z^k \qquad \Bigl(b_0 = \frac{1}{a_0}\Bigr),$$

then

$$\xi(t) = \sum_{n=0}^{\infty} b_n\,\eta(t - n),$$

and we have obtained the representation of the sequence $\xi(t)$ in the
form of a response of a physically realizable filter to an uncorrelated
sequence $\eta(t)$. Since

$$\xi(t) = \frac{1}{a_0} \bigl[ -a_1\,\xi(t - 1) - \dots - a_p\,\xi(t - p) + \eta(t) \bigr], \tag{29}$$

the optimal forecast based on the given $\xi(t - n)$ ($n = 1, 2, \dots$) is of the
form

$$\tilde\xi(t) = -\frac{1}{a_0} \bigl[ a_1\,\xi(t - 1) + a_2\,\xi(t - 2) + \dots + a_p\,\xi(t - p) \bigr].$$

The minimal mean square error of the forecast is equal to

$$\delta(1) = \Bigl\{ E\,\frac{|\eta(t)|^2}{|a_0|^2} \Bigr\}^{1/2} = \frac{1}{|a_0|}.$$

Repeated application of formula (29) allows us to obtain an optimal
forecast for a number of steps forward.
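The recursion of Example 4 is easy to try out. The sketch below is illustrative and assumes real coefficients; the helper names `ma_coefficients` and `forecast` and the sample coefficients are hypothetical. The first function expands $1/P(z)$ into the power series $\sum b_n z^n$, the second applies formula (29) repeatedly with the unknown future values of $\eta$ replaced by zero.

```python
def ma_coefficients(a, count):
    """First `count` coefficients b_n of 1/P(z), P(z) = sum_k a[k] z^k.

    Obtained from the identity P(z) * sum_n b_n z^n = 1.
    """
    b = []
    for n in range(count):
        acc = 1.0 if n == 0 else 0.0
        for k in range(1, min(n, len(a) - 1) + 1):
            acc -= a[k] * b[n - k]
        b.append(acc / a[0])
    return b

def forecast(a, past, steps):
    """Forecast xi(t), xi(t+1), ... by repeated use of formula (29).

    past[-1] is xi(t-1), past[-2] is xi(t-2), and so on; at least p = len(a) - 1
    past values are required.
    """
    values = list(past)
    out = []
    for _ in range(steps):
        nxt = -sum(a[k] * values[-k] for k in range(1, len(a))) / a[0]
        values.append(nxt)
        out.append(nxt)
    return out

# Hypothetical coefficients: P(z) = 1 - 0.5 z + 0.06 z^2 has zeros at 5 and 10/3.
a = [1.0, -0.5, 0.06]
b = ma_coefficients(a, 3)
preds = forecast(a, [1.0, 2.0], 2)
print(b, preds)
```

With these coefficients the one-step error is $1/|a_0| = 1$, and the coefficients $b_n$ are the weights of the filter representation $\xi(t) = \sum b_n \eta(t-n)$.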



§9. General Theorems on Forecasting Stationary Processes 

In this section certain general theorems on forecasting stationary se- 
quences and processes from the infinite past are discussed. As before we 
refer to a wide-sense stationary process with expectation 0 as simply a 
stationary process. 

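Forecasting stationary sequences. Let $\{\xi(t),\ t = 0, \pm 1, \pm 2, \dots\}$ be a stationary
sequence. Denote by $\mathscr{H}_\xi$ the closed linear span in $\mathscr{L}_2$ generated
by all the variables $\xi(t)$ and by $\mathscr{H}_\xi(t)$ the closed linear span generated by
the variables $\xi(n)$, $n \le t$. Clearly, $\mathscr{H}_\xi(t) \subset \mathscr{H}_\xi(t + 1)$ and $\mathscr{H}_\xi$ is the closure
of $\bigcup_{t = -\infty}^{\infty} \mathscr{H}_\xi(t)$. Consider the shift operation $S$ in $\mathscr{H}_\xi$. This operation is
defined by the equality

$$S\eta = \sum_k c_k\,\xi(t_k + 1),$$

provided $\eta$ is of the form $\eta = \sum_k c_k\,\xi(t_k)$. The operation $S$ possesses
the inverse $S^{-1}$,

$$S^{-1}\eta = \sum_k c_k\,\xi(t_k - 1),$$

and preserves the scalar product: for $\eta_1 = \sum_k c_k\,\xi(t_k)$ and $\eta_2 = \sum_r d_r\,\xi(t_r)$,

$$(S\eta_1, S\eta_2) = \sum_k \sum_r c_k \bar d_r\,E\,\xi(t_k + 1)\,\overline{\xi(t_r + 1)} = \sum_k \sum_r c_k \bar d_r\,E\,\xi(t_k)\,\overline{\xi(t_r)} = (\eta_1, \eta_2).$$

Hence $S$ can be extended by continuity to the whole of $\mathscr{H}_\xi$. Moreover it
becomes a unitary operation in $\mathscr{H}_\xi$.

We introduce the spectral representation of the sequence $\xi(t)$

$$\xi(t) = \int_{-\pi}^{\pi} e^{itu}\,\nu(du),$$

where $\nu$ is the spectral stochastic measure with the structure function $F$.
Henceforth we shall not distinguish between the measure $F(A)$ and the
spectral function of the sequence $F(u) = F[-\pi, u)$ which generates this
measure $F(A)$.

Recall that the random variable $\eta$ belongs to $\mathscr{H}_\xi$ if and only if

$$\eta = \int_{-\pi}^{\pi} \varphi(u)\,\nu(du), \quad \text{where } \varphi \in \mathscr{L}_2\{F\}.$$

Consider the sequence of random variables

$$\eta(t) = S^t \eta \quad (t = 0, \pm 1, \pm 2, \dots).$$

Lemma 1. The sequence $\eta(t)$ is stationary and has the spectral representation

$$\eta(t) = \int_{-\pi}^{\pi} e^{itu}\,\varphi(u)\,\nu(du). \tag{1}$$

The stationarity follows from the fact that the operator $S$ is unitary:

$$E\,\eta(t + s)\,\overline{\eta(s)} = (\eta(t + s), \eta(s)) = (S^{t+s}\eta, S^{s}\eta) = (S^{t}\eta, \eta).$$

Finally the spectral representation (1) can be easily verified for the elements
$\eta$ of the form $\eta = \sum_k c_k\,\xi(t_k)$, and this representation
is obtained for arbitrary $\eta$ by means of the limiting transition. □

We note the following additional properties of the operator $S$:

a) $S\mathscr{H}_\xi(t) = \mathscr{H}_\xi(t + 1)$;

b) if $\xi^{(p)}(t)$ is the projection of $\xi(t)$ on $\mathscr{H}_\xi(t - p)$, then

$$S\xi^{(p)}(t) = \xi^{(p)}(t + 1), \qquad S^q \xi^{(p)}(t) = \xi^{(p)}(t + q).$$

Since

$$E|\xi^{(p)}(t + q)|^2 = E|S^q \xi^{(p)}(t)|^2 = E|\xi^{(p)}(t)|^2,$$

the quantity $E|\xi^{(p)}(t)|^2$ does not depend on $t$. Therefore the following
quantity

$$\delta^2(p) = E|\xi(t) - \xi^{(p)}(t)|^2 = E|\xi(t)|^2 - E|\xi^{(p)}(t)|^2,$$

which equals the square of the minimal mean square error of forecasting
$\xi(t)$ by means of $\xi(n)$, $n \le t - p$, is also independent of $t$.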

Clearly, $\delta^2(p) \le \delta^2(p + 1) \le \sigma^2$, where $\sigma^2 = E|\xi(t)|^2$.

The equality $\delta^2(n) = \sigma^2$ means that $\xi(t)$ is uncorrelated with all the variables
$\xi(k)$, $k \le t - n$, for all $t$, so that the knowledge of these terms does
not help as far as the forecasting of $\xi(t)$ is concerned. If $\delta(1) = 0$, then
$\xi(t) \in \mathscr{H}_\xi(t - 1)$, hence $\mathscr{H}_\xi(t) = \mathscr{H}_\xi(t - 1)$ and in general $\mathscr{H}_\xi(t) = \mathscr{H}_\xi(n)$ for
any $t$ and $n < t$. Set $\mathscr{H}_\xi^s = \bigcap_t \mathscr{H}_\xi(t)$. In our case $\mathscr{H}_\xi^s = \mathscr{H}_\xi$. This means that
if the sequence of values of the process $\xi(n)$, $n \le t$, is known, then all the
succeeding terms of this sequence can with probability 1 be linearly expressed
in terms of those observed. In a certain sense the opposite case
is that in which $\mathscr{H}_\xi^s = 0$ (where $0$ denotes the trivial subspace of $\mathscr{H}_\xi$ consisting
of the singleton $0$). Here the knowledge of the terms of the sequence
$\xi(n)$ ($n \le s$) adds little as far as the forecasting of the variable
$\xi(s + t)$ is concerned for large $t$, since $\lim_{t \to \infty} E|\xi^{(t)}(s + t)|^2 = 0$ and $\lim_{t \to \infty} \delta(t) = \sigma$.

Definition 1. If $\mathscr{H}_\xi^s = \mathscr{H}_\xi$ then the process $\xi(t)$ is called singular (or determinate*);
if $\delta(1) > 0$ the process $\xi(t)$ is called undeterminate; if $\mathscr{H}_\xi^s = 0$
the process is called regular (or completely undeterminate).

Definition 2. Let $\xi_i(t)$, $i = 1, 2$, be Hilbert random processes, $t \in T$, let $T$ be
an arbitrary set of real numbers and let $\mathscr{H}_{\xi_i}(t)$ be the closed linear span of $\{\xi_i(s),\ s \le t,\ s \in T\}$. We
say that $\xi_1(t)$ is completely subordinate to the process $\xi_2(t)$ if
$\mathscr{H}_{\xi_1}(t) \subset \mathscr{H}_{\xi_2}(t)$ for all $t \in T$.

Theorem 1. An arbitrary stationary sequence admits a representation of the
form

$$\xi(t) = \xi_s(t) + \eta(t), \tag{2}$$

where $\xi_s(t)$ and $\eta(t)$ are mutually uncorrelated sequences, completely subordinate
to $\xi(t)$, $\xi_s(t)$ is singular and $\eta(t)$ is regular. The representation (2)
is unique.

Proof. Clearly SJ^l = ^|. Since S is unitary, the orthogonal complement 
of is invariant under S, i.e. S maps one-to-one the subspace = 
= on itself (here is the orthogonal complement of in 

the space 

Let 4(0) be the projections of (^(0) on q{0) be the projection of 
(^(0) on and 4(0 = *^^4(0), ^(0 = ‘^^^(0), ^ = 0, ±1, ±2,.... Since 
(^(0) = 4(0) + ^(0), then c^(/) = 5^40) = 4 (0 + ^(0^ where the sequences 
q(t) and 4(0 stationary, mutually uncorrelated and subordinate to 
^(0 (Lemma 1). 

Next since in relation (2) 4(0^^^! then (0 n c: 

(0- Therefore . On the other hand 4(0^'^! yields 

j^^[t)c^J^l. Therefore for any t = — = i.e., the sequence 

4(0 is singular. Next it follows from the equality q{t) = ^{t) — ^A^) th^t 
^(0ec^^(0- Therefore = n On the other hand J^;^(0 is 



deterministic using Doob’s terminology. Translator’s Remark. 
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orthogonal to by definition. Therefore = the process rj{t) is 
regular. 

The uniqueness of representation (2) follows from the fact that under 
the conditions of the theorem the projection of rj {t) on is zero, = 
= and consequently, 4(0 is the projection of ^{t) on The 

theorem is thus proved. □ 

The sequences $\eta(t)$ and $\xi_s(t)$ are called the regular and singular components
of the process $\xi(t)$ respectively.

Theorem 2. The regular component $\eta(t)$ of a stationary sequence can be
represented in the form

$$\eta(t) = \sum_{n=0}^{\infty} a(n)\,\zeta(t - n), \tag{3}$$

where $\zeta(t)$ ($t = 0, \pm 1, \pm 2, \dots$) is a standard uncorrelated sequence,
$\mathscr{H}_\zeta(t) = \mathscr{H}_\eta(t)$ and $\sum_{n=0}^{\infty} |a(n)|^2 < \infty$.

Proof. We introduce the subspace G{t) = J^^[t)QJ^^{t—\). This space 
is one dimensional (if it were 0-dimensional, then (5^ ( 1 ) = 0 and q (t) would 
be a singular sequence). We choose in G(0) a unit vector C(0). Then the 
sequence C(0 = ‘^^C(0) is orthonormal (C(t)eJ^^(t)QJ^^^(t—l) therefore 
C(t) is orthogonal to J^f^^(t-l), and moreover C(k)eJ^^(t — 1) for k<t), 

j^ft)ci3e^(t), n ^^(o^n =^,(0=0. 

t t 

This means that the sequence C(0 forms a basis in Expanding ^(0) 
in terms of this basis we obtain 

00 00 

q{0)=Y ^(^)C( — «), where Y \ci{n)\^ = E\q{0)\^ <co. 

n=0 n=0 

Applying operator to the given expansion of the variable ^(0) we ob- 
tain equality (3). Relation Jf’^(/)c: follows directly from (3) and the 

inclusion in the other direction follows from the definition of C(t). The 
theorem is thus proved. □ 

Remark 1. We may assume without loss of generality that a (0) is positive. 

Lemma 2. Let the spectral function $F(u)$ of a stationary process $\xi(t)$ be
equal to $F_1(u) + F_2(u)$, where $F_i(u)$ are non-negative monotonically non-decreasing
functions and the measures $F_i(A)$ which correspond to the functions
$F_i(u)$ are mutually singular. Then a decomposition $\xi(t) = \xi_1(t) + \xi_2(t)$ exists where
the processes $\xi_i(t)$ are subordinate to $\xi(t)$, are orthogonal and have spectral
functions $F_i(u)$ ($i = 1, 2$).

To prove this assertion we represent the interval $[-\pi, \pi]$ as the sum of disjoint
sets $P_1$ and $P_2$ such that $F_2(P_1) = F_1(P_2) = 0$. Set

$$\xi_1(t) = \int_{-\pi}^{\pi} e^{itu}\,\chi_{P_1}(u)\,\nu(du), \qquad \xi_2(t) = \int_{-\pi}^{\pi} e^{itu}\,\chi_{P_2}(u)\,\nu(du),$$

where $\nu$ is the stochastic spectral measure of the process $\xi(t)$ and $\chi_{P_j}(u)$ is
the indicator of the set $P_j$. Then

$$\xi_1(t) + \xi_2(t) = \int_{-\pi}^{\pi} e^{itu}\,\nu(du) = \xi(t),$$

$$E\,\xi_1(t_1)\,\overline{\xi_2(t_2)} = \int_{-\pi}^{\pi} e^{i(t_1 - t_2)u}\,\chi_{P_1}(u)\,\chi_{P_2}(u)\,dF(u) = 0,$$

$$E\,\xi_j(t_1)\,\overline{\xi_j(t_2)} = \int_{-\pi}^{\pi} e^{i(t_1 - t_2)u}\,\chi_{P_j}(u)\,dF(u) = \int_{-\pi}^{\pi} e^{i(t_1 - t_2)u}\,dF_j(u), \quad j = 1, 2,$$

which proves the lemma. □



Theorem 3. In order that a sequence $\xi(t)$ be non-determinate it is necessary
and sufficient that

$$\int_{-\pi}^{\pi} \ln f(u)\,du > -\infty, \tag{4}$$

where $f(u)$ is the derivative of the absolutely continuous component of $F(A)$
(with respect to the Lebesgue measure).
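Condition (4) can be explored numerically. The sketch below is an illustration that relies on a classical companion result not stated in this passage, the Kolmogorov-Szegő formula $\delta^2(1) = 2\pi \exp\bigl\{\frac{1}{2\pi}\int_{-\pi}^{\pi} \ln f(u)\,du\bigr\}$, under the normalization in which $E|\xi(t)|^2 = \int_{-\pi}^{\pi} f(u)\,du$ (so that a standard uncorrelated sequence has density $1/2\pi$). The function name `one_step_error_sq` and both sample densities are assumptions made for the example.

```python
import numpy as np

def one_step_error_sq(density, n=4096):
    """Evaluate 2*pi*exp(mean of ln f) on a uniform midpoint grid of [-pi, pi].

    The mean of ln f over the grid approximates (1/2pi) * integral of ln f.
    If f vanishes on a set of positive measure the result is 0.
    """
    u = -np.pi + 2 * np.pi * (np.arange(n) + 0.5) / n
    with np.errstate(divide="ignore"):
        log_f = np.log(density(u))
    return 2 * np.pi * np.exp(np.mean(log_f))

rho = 0.5  # illustrative autoregression coefficient

def ar1_density(u):
    # Density of xi(t) = rho * xi(t-1) + eta(t) with standard uncorrelated eta.
    return 1.0 / (2 * np.pi * np.abs(1 - rho * np.exp(-1j * u)) ** 2)

def band_limited_density(u):
    # Vanishes on a set of positive measure, so the log-integral diverges.
    return np.where(np.abs(u) < 1.0, 1.0, 0.0)

err_ar1 = one_step_error_sq(ar1_density)
err_band = one_step_error_sq(band_limited_density)
print(err_ar1, err_band)
```

The first density satisfies (4) and gives a one-step error equal to the variance of the driving sequence; the second violates (4), and the computed error collapses to zero, the numerical counterpart of a determinate sequence.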

Proof. Necessity. Let F^{u), P^(w) be the spectral functions of the se- 
quences rj{t) and 4(0- Since ri{t) and 4(0 uncorrelated, it follows that 



F(u) = F,{u) + Fs{u). 



In view of Theorem 2 and Theorem 2 of Section 7, F^{u) is absolutely 

It 



continuous and for f{u) = F^{u) the condition 



\nf{u)du>—co is 
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satisfied. Decomposing the measures F(A) and F^{A) into absolutely 
continuous and singular components with respect to the Lebesgue 
measure, we obtain 

F(.4) = | /(u) du + F*{A), F,(.4) = | /,(«) du + Ff{A). 

A A 

It thus follows that 

f{u)=friu)+fs(u) 

and 



lnf{u)du^ \nf^{u) du> — CO . 



Hence, if the process is non-determinate, (4) is valid. 

Sufficiency. Assume that the process ^(t) is singular. In this case the 
decomposition of ^{t) into two uncorrelated components ^^{t) and ^ 2(0 
subordinate to ^{t) corresponds to the decomposition of F{A) = F^{A) = 



= fs{^) du-\-Ff{A) (cf. Lemma 2). Let In/ (u) du= ln/(w) du > — 00. 



Then in view of Theorem 2, Section 7, X! where 

n = 0 

is an uncorrelated sequence. Since 

and = 

we have n which contradicts the relation 

^ 0 so that the process ^{t) cannot be singular. Therefore 



Inf {u) du= — CO . 



The theorem is thus proved. □ 

We now consider the problem of forecasting non-determinate processes.
Utilizing Theorems 1 and 2 we write

$$\xi(t) = \xi_s(t) + \eta(t), \qquad \eta(t) = \sum_{n=0}^{\infty} a(n)\,\zeta(t - n).$$

Since $\xi_s(t)$ is completely predicted "from the past" it is sufficient to
consider the forecasting of the regular component $\eta(t)$ of the process
$\xi(t)$. It follows from Theorem 2 that the projection of $\eta(t)$ on $\mathscr{H}_\eta(t - q)$
coincides with the projection on $\mathscr{H}_\zeta(t - q)$. Consequently

$$\eta^{(q)}(t) = \sum_{n=q}^{\infty} a(n)\,\zeta(t - n). \tag{5}$$

The value of the mean square error is determined from the equality

$$\delta^2(q) = \sum_{n=0}^{q-1} |a(n)|^2. \tag{6}$$
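Formulas (5) and (6) translate directly into a few lines of code. The sketch below is illustrative: it assumes real coefficients, truncates the infinite series at the length of the supplied arrays, and uses the hypothetical coefficients $a(n) = \rho^n$; the helper names are not from the text.

```python
import numpy as np

def forecast_error_sq(a, q):
    """delta^2(q) = sum_{n < q} |a(n)|^2, as in formula (6)."""
    a = np.asarray(a)
    return float(np.sum(np.abs(a[:q]) ** 2))

def forecast_from_innovations(a, zeta_past, q):
    """Truncated version of formula (5).

    zeta_past[j] is taken to be zeta(t - q - j), j = 0, 1, ...
    Returns sum over n >= q of a(n) * zeta(t - n), cut off at the data length.
    """
    a = np.asarray(a)
    zeta_past = np.asarray(zeta_past)
    tail = a[q:q + len(zeta_past)]
    return float(np.dot(tail, zeta_past[:len(tail)]))

rho = 0.5
a = rho ** np.arange(60)                  # assumed coefficients a(n) = rho^n
errors = [forecast_error_sq(a, q) for q in (1, 2, 3)]
total_variance = forecast_error_sq(a, len(a))
value = forecast_from_innovations(a, [1.0, 0.0, 0.0], 2)
print(errors, total_variance, value)
```

The errors increase with the forecasting horizon $q$ and approach the total variance $\sum |a(n)|^2$, which here is close to $1/(1-\rho^2)$.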

We now obtain a formula for optimal forecast which does not involve 
the sequence C(n). Since 








Now we have 



(t) = 




a„ e 



(p{u) v{du). 



hence 



where 




( 7 ) 

( 8 ) 



We now present a method to determine the function $g(z) = \sum_{n=0}^{\infty} b_n z^n$,
where $b_n = \dfrac{1}{\sqrt{2\pi}}\,a_n$. This will give us a general solution of the problem
of forecasting a stationary sequence as well as the formula for computing
the mean square error of the forecast. The function $g(z) \in H_2$, and $g(0) = \dfrac{a_0}{\sqrt{2\pi}}$
is real (cf. Remark 1 after Theorem 2). The spectral density of the sequence
$\eta(t)$ is factorized in terms of the function $g(z)$, namely we have
$f_\eta(u) = |g(e^{-iu})|^2$. In view of Remark 2 after Theorem 1 of Section 7, the function
$g(z)$ is determined uniquely by means of $f_\eta(u)$ provided it has no zeros
in the circle $|z| < 1$ and $g(0) > 0$. Therefore if the function $g(z)$, constructed
in accordance with Theorem 1 of Section 7, does not vanish for $|z| < 1$,
it is then identical with the function $g(z)$ obtained in the course of the proof
of Theorem 1 in Section 7.



Lemma 3. Function g{z) does not vanish in |z| < 1. 



Proof. First we note that if/^(w) = |/z(e'“)|^, /z(z)= ^ c„z", ^ |c„|^<oo, 

«=o «=o 

then |col^. Indeed, 



«(0)- Z 



n 



1 - Zc, 

k=l J \k = 0 






du^27i\cQ\‘ 



Since this inequality is valid for any d^ and N, 

S^{1)^2k\cq\^. 



(9) 
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Assume now that g{zo) = 0, |zo|<l. The function g{z) = a„z^ 

vanishes at point Zq. Set 



g{z) = {z-Zo) Y, b'„z", where b’o= ^ — . 



Then 



l0(e’“)| = 



"=o y/ln 
l-e"‘“zo 



e 



=\g{e-“‘)\ = 



e ‘“-zo 



|e-“-Zo 



E b'„^ 



« = 0 



Z 

n = 0 



E B: e‘”“< 

n = 0 



where Bq = Bq = 




— . It follows from (9) that 
^0 



<5^(l)>23r|E^|^ 



«o 



2 



^0 



which is impossible in view of (6) if \zq\ < 1. The lemma is thus proved. □ 

Corollary. In the formula of the optimal forecasting (7) the function
$g(z) \in H_2$ is uniquely determined (provided that $g(0)$ is positive) and coincides
with the function obtained in Theorem 1 of Section 7.

We have thus solved the problem of forecasting for the regular part
of a non-determinate sequence. The following points should now be
clarified: how to express the spectral density of the sequence $\eta(t)$ in terms
of the spectral function of the process $\xi(t)$, and what is the form of the forecasting
formula for the sequence $\xi(t)$ expressed in terms of the characteristic
quantities of $\xi(t)$?

Lemma 4. Let a non-determinate process $\xi(t)$ be represented as
$\xi(t) = \eta(t) + \xi_s(t)$, where $\eta(t)$ and $\xi_s(t)$ are uncorrelated, $\xi_s(t)$ is a singular
process and $\eta(t)$ is a regular process, and let $F(u)$, $F_\eta(u)$ and $F_s(u)$ be the spectral
functions of the sequences $\xi(t)$, $\eta(t)$ and $\xi_s(t)$.
Then the equality

$$F(u) = F_\eta(u) + F_s(u) \tag{10}$$

is a decomposition of the function $F(u)$ into absolutely continuous $F_\eta(u)$ and
singular $F_s(u)$ components with respect to the Lebesgue measure.
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Proof. Formula (10) follows from the fact that the sequences t]{t) and 
4(/) are uncorrelated. We introduce a spectral representation of an un- 
correlated sequence l{t) appearing in representation (3), 

n 

C(0= j (11) 

— 7t 

where C{A) is a stochastic measure with a structure function ^ I {A), 

2n 

where / is the Lebesgue measure. Substituting (11) into (3) we obtain 

n 

r,{t)= f 



Let 



e'"‘v,{du) (12) 

— n 

be the spectral representation of the sequence ^ft). Then 

n It 

^ (f) = j* e‘'“ V (du) = j* g (e‘“) C (du) + v, (tft/)] . 

- 7t —n 

It follows from the last equality that 

It n 

(p (u) V (du) = J (p{u) g (e'“) C (du) + (du)~\ ( 1 3) 

— 7t —n 

for any function (p{u)e ^2 {^}- 

Another spectral representation may be given for function ^(t). 
Since it follows that 



4(0)= 



(p,(u) v{du). 



hence 



4(t) = S'<^s(0) = 



(Ps{u) v(du). 



— Tt 
Taking (13) into account we obtain

$$\xi_s(t) = \int_{-\pi}^{\pi} e^{itu} \varphi_s(u) \left[\sqrt{2\pi}\, g(e^{iu})\, \zeta(du) + \nu_s(du)\right]. \tag{14}$$

Comparing (14) with (12) we see that

$$-\int_{-\pi}^{\pi} e^{itu} (\varphi_s(u) - 1)\, \nu_s(du) = \int_{-\pi}^{\pi} e^{itu} \varphi_s(u) \sqrt{2\pi}\, g(e^{iu})\, \zeta(du).$$

The elements appearing on the two sides of this equality belong
to mutually orthogonal subspaces. Therefore they are equal to zero.
Consequently,

$$\varphi_s(u) = 1 \pmod{F_s}, \qquad \varphi_s(u)\, g(e^{iu}) = 0 \pmod{l}.$$

Since $g(e^{iu})$ may equal zero only on a set of $l$-measure 0, $\varphi_s(u)$ is zero
almost everywhere. Let $S$ be the set on which $\varphi_s(u) = 1$. Then $l(S) = 0$.
Therefore



FM)=\ \(Ps{urdF{u) = F{AnS), 



A 



F,(.4) = J 2n\g{e^^)\^ du. 

A 

The lemma is thus proved. □ 

Lemma 5. Let cp^ (w), (p 2 (u) and (p^ (u) be such that 



(pi{u)\ (du), (p 2 (u) V (du), 



(P 2 (u) v,{du) 



are projections of the variables ^{t), rj(t) and ^^{t) on the spaces J^^(t — q), 
{t — q) and {t — q) respectively. 

Then 

1 (m) = <P 2 (m) = <?3 (m) = ( 1 - j (mod F) . 

\ g(e‘J 



In view of formula (7) it is sufficient to prove that $\varphi_1(u) = \varphi_2(u) = \varphi_3(u)$.
It follows from equality

K 




(Pi{u) v{du) = 



It 




(Pi{u) v{du) 



+ 




(Pi{u)v,(du) 






(15) 



— n —It 

and from the orthogonality of the summands appearing in the brackets 
of the r.h.s. of equation (15) that 

n 



<5|(^f)= E/7(0- 



(pi{u) v{du)\^ + 



+ E 




(Pi{u)v,{du) 






where the equality is attained if and only if

$$\varphi_1(u) = \varphi_2(u) \pmod{F_\eta}, \qquad \varphi_1(u) = \varphi_3(u) \pmod{F_s},$$



K 



4 ( 0 = 



(Pi(u)v,{du). 



On the other hand, in view of the definition of ^(0 3f(q)=S^(q). The 
lemma is thus proved. □ 

The results obtained may be formulated as follows: 

Theorem 4. If ^(t) is a non-determinate stationary sequence, then the op- 
timal forecast (/) of the variable ^ (/) based on the observations of ^ (^), 

s^t — q is given by the formula 



K 




where v is the spectral stochastic measure of the sequence ^{t), 

g{^)= Z 9q{^)= Z 

n=0 n=0 

moreover, the function g(z)eH 2 does not vanish in the circle |z|<l, ^(0) 
is positive and = f(u), where f(u) is the derivative of an absolutely

continuous component of the spectral function of the sequence ^{t). The 
square of the mean square error in forecasting is given by 

n 

JL j In f(u)du 

6^{q) = 2n X 

n = 0 

where c„ is determined from 

It 

exp{^ E I ‘^“1= 



In particular,

$$\delta^2(1) = 2\pi \exp\left\{\frac{1}{2\pi} \int_{-\pi}^{\pi} \ln f(u)\, du\right\}. \tag{16}$$


The theorem follows directly from Lemmas 4 and 5, formula (7) of 
the present section, and from Theorem 1 and Remark 2 in Section 7. □ 
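
The one-step error formula can be checked numerically on a simple case. The following is only a sketch: it assumes the convention $R(t) = \int_{-\pi}^{\pi} e^{itu} f(u)\,du$ for the spectral density, and the names `one_step_error_sq` and `ma1_density` are hypothetical helpers introduced for illustration. For the moving average $\xi(t) = \zeta(t) + b\,\zeta(t-1)$ with $|b| < 1$ and unit-variance uncorrelated $\zeta$, the one-step forecast error should equal the variance of $\zeta$, that is 1.

```python
import cmath
import math

def one_step_error_sq(f, n=4096):
    """Evaluate 2*pi*exp((1/(2*pi)) * integral_{-pi}^{pi} ln f(u) du) numerically."""
    h = 2 * math.pi / n
    # Midpoint rule on a uniform grid; f is assumed positive and 2*pi-periodic.
    integral = sum(math.log(f(-math.pi + (k + 0.5) * h)) for k in range(n)) * h
    return 2 * math.pi * math.exp(integral / (2 * math.pi))

def ma1_density(b):
    """Spectral density of xi(t) = zeta(t) + b*zeta(t-1) with unit-variance zeta."""
    return lambda u: abs(1 + b * cmath.exp(-1j * u)) ** 2 / (2 * math.pi)

# Assumed example coefficient; any |b| < 1 should give the same value.
err_sq = one_step_error_sq(ma1_density(0.5))
```

The value does not depend on $b$ because $\int_{-\pi}^{\pi} \ln|1 + b e^{-iu}|^2\,du = 0$ whenever $|b| < 1$.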

Forecast of continuous parameter processes. Let $\xi(t)$ ($-\infty < t < \infty$) be a
stationary process,

$$\xi(t) = \int_{-\infty}^{\infty} e^{itu}\, \nu(du),$$

where $\nu$ is an orthogonal measure on the line ($-\infty < u < \infty$),

$$E\xi(t) = 0, \qquad R(t) = E\,\xi(t+s)\overline{\xi(s)} = \int_{-\infty}^{\infty} e^{itu}\, dF(u), \qquad F(+\infty) = \sigma^2.$$



We introduce the Hilbert space

$$\mathcal{H}_\xi = \mathcal{L}\{\xi(t),\ -\infty < t < \infty\}$$

and its subspaces $\mathcal{H}_\xi(t) = \mathcal{L}\{\xi(s),\ -\infty < s \le t\}$. Define in $\mathcal{H}_\xi$ the group
of shift operators $S^\tau$ ($-\infty < \tau < \infty$), by putting

$$S^\tau\Big(\sum_k c_k\, \xi(t_k)\Big) = \sum_k c_k\, \xi(t_k + \tau)$$

and extend by continuity the definition of $S^\tau$ over the whole $\mathcal{H}_\xi$. Then
$S^\tau$ forms a group of unitary transformations of $\mathcal{H}_\xi$. This group possesses
the same properties (with obvious modifications) as the group of
transformations $S^n$ in the discrete parameter case. The problem of optimal
linear forecast of the process $\xi(t)$ is to find a random variable
$\xi_T(t) \in \mathcal{H}_\xi(t-T)$ such that

$$E|\xi(t) - \xi_T(t)|^2 \le E|\xi(t) - \eta|^2$$

for any $\eta \in \mathcal{H}_\xi(t-T)$. This problem has a unique solution: the variable
$\xi_T(t)$ is the projection of $\xi(t)$ on $\mathcal{H}_\xi(t-T)$. Set

$$\delta_\xi(T) = \delta(T) = \sqrt{E|\xi(t) - \xi_T(t)|^2}.$$

The quantity $\delta(T)$, the mean square error of the forecast, is a monotone
non-decreasing function of $T$ and $0 \le \delta(T) \le \sigma$. If $\lim_{T\to\infty} \delta(T) = \sigma$, then the
process is called regular (purely non-determinate). If $\delta(T_0) = 0$ for
some $T_0$, then $\xi(t) \in \mathcal{H}_\xi(t - T_0)$ for any $t$. Consequently

$$\xi(t) \in \bigcap_{k=1}^{\infty} \mathcal{H}_\xi(t - kT_0)$$

for any $t$ and $\delta(T) = 0$ for all $T > 0$. In this case the process is called
singular (determinate). Non-singular processes will be called
non-determinate.

The proof of Theorem 1 is directly carried over to continuous-time 
processes: an arbitrary stationary process admits decomposition 

$$\xi(t) = \eta(t) + \xi_s(t),$$

where $\eta(t)$ is a regular and $\xi_s(t)$ a singular stationary process, $\eta(t)$ and
$\xi_s(t)$ are uncorrelated and are subordinate to $\xi(t)$. □

The following theorem is a continuous analogue of Theorem 2. 



Theorem 5. In order that a stationary process be regular it is necessary
and sufficient that it be represented as

$$\eta(t) = \int_{-\infty}^{t} a(t-s)\, \zeta(ds), \tag{17}$$

where $\zeta(s)$ is the standard process with orthogonal increments and

$$\int_{-\infty}^{\infty} |a(t)|^2\, dt < \infty.$$



In view of Theorem 4 of Section 7 this theorem is equivalent to the
following

Theorem 6. A stationary process is regular if and only if it possesses the
spectral density $f(u)$ and

$$\int_{-\infty}^{\infty} \frac{\ln f(u)}{1 + u^2}\, du > -\infty. \tag{18}$$



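We first show that a process admitting representation (17) is regular.
For this purpose we introduce the projection $\eta_T(t)$ of the random variable
$\eta(t)$ on $\mathcal{H}_\eta(t-T)$. Since $\mathcal{H}_\eta(t-T) = \mathcal{H}_\zeta(t-T)$, the random variable
$\eta_T(t)$ can be written as

$$\eta_T(t) = \int_{-\infty}^{t-T} \varphi(s)\, \zeta(ds).$$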



On the other hand, the difference $\eta(t) - \eta_T(t)$ should be orthogonal to any
variable $\psi \in \mathcal{H}_\eta(t-T)$ and, in particular, to the variable $\psi = \zeta(A)$, where $A$
is an arbitrary measurable set contained in $(-\infty, t-T)$. Since

$$E\,[\eta(t) - \eta_T(t)]\,\overline{\zeta(A)} = \int_{-\infty}^{t} a(t-s)\chi_A(s)\, ds - \int_{-\infty}^{t-T} \varphi(s)\chi_A(s)\, ds = \int_{-\infty}^{t-T} [a(t-s) - \varphi(s)]\chi_A(s)\, ds,$$

it follows that $\varphi(s) = a(t-s)$, $s \le t-T$. Thus

$$\eta_T(t) = \int_{-\infty}^{t-T} a(t-s)\, \zeta(ds)$$

and

$$\|\eta_T(t)\|^2 = E|\eta_T(t)|^2 = \int_{-\infty}^{t-T} |a(t-s)|^2\, ds = \int_{T}^{\infty} |a(s)|^2\, ds.$$

Therefore $\|\eta_T(t)\| \to 0$ as $T \to \infty$, which shows that the process $\eta(t)$ is
regular. The converse assertion is more profound; it states that every
regular process may be represented by means of formula (17) or,
equivalently, possesses a spectral density $f(u)$ satisfying condition (18).

To prove this assertion we shall utilize results analogous to those
obtained for discrete-parameter processes.







Let $\xi(t)$ be an arbitrary stationary process and let

$$\xi(t) = \int_{-\infty}^{\infty} e^{itu}\, \nu(du)$$

be its spectral representation. Using the transformation $u = \tan\frac{\theta}{2}$ we
transfer the measure $\nu$ from the whole real line $(-\infty, \infty)$ onto the interval
$(-\pi, \pi)$ and denote the transformed measure by $\tilde\nu$. Now let the stationary
sequence $\tilde\xi(n)$ with

$$\tilde\xi(n) = \int_{-\pi}^{\pi} e^{in\theta}\, \tilde\nu(d\theta) \tag{19}$$

correspond to the process $\xi(t)$. We now utilize the following statement,
to be proved below: the process $\xi(t)$ is regular if and only if the sequence
$\tilde\xi(n)$ is regular (Lemma 7). Therefore if $\eta(t)$ is a regular process then
$\tilde\eta(\cdot) = K(\eta)$ is a regular sequence. If $\tilde f(\theta)$ is its spectral density then in
view of Theorem 2 of Section 7

$$\int_{-\pi}^{\pi} \ln \tilde f(\theta)\, d\theta > -\infty. \tag{20}$$

But then the process $\eta(t)$ also has a spectral density $f(u)$ and moreover
$(1 + u^2) f(u) = 2\tilde f(\theta)$, $\theta = 2 \arctan u$. Therefore (20) implies (18). The
theorem is thus proved. □
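
The change of variable used in this proof can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the relation between the two densities is derived from $du = \frac{1+u^2}{2}\,d\theta$ for $u = \tan\frac{\theta}{2}$. It verifies that $e^{i\theta} = \frac{1+iu}{1-iu}$ and that a Cauchy density on the line is carried into the uniform density $\frac{1}{2\pi}$ on $(-\pi, \pi)$.

```python
import cmath
import math

def to_line(theta):
    """Map theta in (-pi, pi) to u = tan(theta / 2) on the real line."""
    return math.tan(theta / 2)

def cayley(u):
    """Return (1 + iu) / (1 - iu), which should equal exp(i * theta)."""
    return (1 + 1j * u) / (1 - 1j * u)

def circle_density(f, theta):
    """Density on (-pi, pi) induced by a line density f: (1 + u^2) f(u) / 2."""
    u = to_line(theta)
    return (1 + u * u) * f(u) / 2

# Assumed example: the Cauchy density on the line.
cauchy = lambda u: 1 / (math.pi * (1 + u * u))
```

For the Cauchy density the factor $1 + u^2$ cancels exactly, which is why the weight $\frac{1}{1+u^2}$ appears in condition (18).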

We now prove the assertion utilized in the proof of the theorem.

Let $\xi(t)$ be an arbitrary stationary process and let $\tilde\xi(n)$ be defined by
means of the correspondence $K$ as given above (see the relation preceding
equation (19)).

Lemma 6. The equality $\mathcal{H}_\xi(0) = \mathcal{H}_{\tilde\xi}(0)$ is valid.

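Proof. We show that $\tilde\xi(-n) \in \mathcal{H}_\xi(0)$, $n > 0$ (for $n = 0$ this is evident). Note
that $e^{i\theta} = \dfrac{1 + iu}{1 - iu}$. Therefore,

$$\tilde\xi(-n) = \int_{-\pi}^{\pi} e^{-in\theta}\, \tilde\nu(d\theta) = \int_{-\infty}^{\infty} \left(\frac{1 - iu}{1 + iu}\right)^{n} \nu(du).$$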
On the other hand,

$$\frac{1 - iu}{1 + iu} = -1 + \frac{2}{1 + iu} = -1 + 2\int_{-\infty}^{0} e^{s} e^{ius}\, ds.$$

Hence, the function $\left(\dfrac{1 - iu}{1 + iu}\right)^{n}$ can be approximated by a bounded sequence
of functions of the form $\sum_k c_k e^{iut_k}$, $t_k < 0$, uniformly convergent on an
arbitrary finite interval $(-A, A)$. It thus follows that $\tilde\xi(-n) \in \mathcal{H}_\xi(0)$.
We now show that $\xi(t)$ belongs to $\mathcal{H}_{\tilde\xi}(0)$ for $t < 0$. We have



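$$\xi(t) = \int_{-\infty}^{\infty} e^{itu}\, \nu(du) = \int_{-\pi}^{\pi} \exp\left\{t\, \frac{e^{i\theta} - 1}{e^{i\theta} + 1}\right\} \tilde\nu(d\theta) = \lim_{\rho \uparrow 1} \int_{-\pi}^{\pi} \exp\left\{t\, \frac{1 - \rho e^{-i\theta}}{1 + \rho e^{-i\theta}}\right\} \tilde\nu(d\theta),$$

since the integrand in the last integral is uniformly bounded for $0 < \rho < 1$
and $t < 0$.

On the other hand it follows from the equality

$$\frac{1}{1 + \rho e^{-i\theta}} = \sum_{n=0}^{\infty} (-1)^n \rho^n e^{-in\theta}$$

that this integrand can be uniformly approximated (for fixed $\rho$) by
functions of the form $\sum_{k=0}^{N} c_k e^{-ik\theta}$. Therefore $\xi(t) \in \mathcal{H}_{\tilde\xi}(0)$ for $t < 0$. The
lemma is thus proved. □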

Lemma 7. If $\xi(t) = \xi_s(t) + \eta(t)$ is a decomposition of the process $\xi(t)$ into
singular and regular components, then the equality $\tilde\xi(n) = K(\xi_s) + K(\eta)$ is the
decomposition of $\tilde\xi(n)$ into the same components.



Proof Note that te{— 00 , 00 ) if! a jif^^{n), n = 0, 

±1,.... Indeed if (0) ci ^ JO), then c^|j0) = Jf’^j0)cz ^^^^^0) = 
= ^2 (^)’ hence {t) = (0) c= (0) = (t). Moreover in these 

relations the roles of the processes {ii{t), ^2(0) iiW) can be 

interchanged. Note also that since the measure v is subordinate to the 
process {^{t), —oo<t<co}, it follows that = Next let ^{t) be a 
singular process. Then Jf’| = ^^ = ^j0) = ^l(0). This means that ^{n) 
is a singular process. Analogously we obtain that the singularity of ^{n) 
implies the singularity of $\xi(t)$. Let $\xi(t)$ be regular. If $\tilde\xi(n)$ were not regular
we would have the equality $\tilde\xi(n) = \tilde\xi_1(n) + \tilde\eta_1(n)$, where the sequence $\tilde\xi_1(n)$
is singular and completely subordinate to $\tilde\xi(n)$. The decomposition
$\xi(t) = \xi_1(t) + \eta_1(t)$ will then correspond to this decomposition, where $\xi_1(t)$ is,
as has been shown, a singular process completely subordinate to $\xi(t)$.
However this implies that H = which contra- 

dicts the regularity of the process ^{t). Thus ^{n) is regular provided ^{t) 
is such. The converse is analogously proved. The assertion of the lemma 
follows from here. □ 

Results obtained for forecasting stationary sequences can now be 
carried over with certain modifications in the statement of theorems and 
the proofs to the case of continuous parameter processes. Here one uses 
the spectral representation of continuous parameter stationary processes 
and refers to the results of Lemma 2 and Theorems 3 and 4 of Section 7. 

For example, the statement of Lemma 4 is carried over verbatim to 
the case of continuous-parameter processes and only trivial modifica- 
tions in the proof are required. From this analog of Lemma 4, we deduce 
the following 

Theorem 7. In order that the process $\xi(t)$ be non-determinate, it is
necessary and sufficient that

$$\int_{-\infty}^{\infty} \frac{\ln f(u)}{1 + u^2}\, du > -\infty,$$

where $f(u)$ is the derivative of the absolutely continuous component of the
spectral measure $F$ of the process $\xi(t)$.

If $\xi(t) = \eta(t) + \xi_s(t)$ is the decomposition of the process $\xi(t)$ into
regular and singular components, and in view of Theorem 5

$$\eta(t) = \int_{-\infty}^{t} a(t-s)\, \zeta(ds),$$

then

$$\xi_T(t) = \int_{-\infty}^{t-T} a(t-s)\, \zeta(ds) + \xi_s(t).$$

Moreover, the optimal mean square error of the forecast is determined
from the relation

$$\delta^2(T) = \int_{0}^{T} |a(s)|^2\, ds.$$
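
As an illustration of the error formula $\delta^2(T) = \int_0^T |a(s)|^2\,ds$, the sketch below evaluates it for an exponential kernel $a(s) = e^{-\alpha s}$. The kernel, the decay rate `alpha` and the helper names are assumptions chosen for the example, not taken from the text. In this case the closed form is $(1 - e^{-2\alpha T})/(2\alpha)$, which increases with $T$ toward the variance $1/(2\alpha)$ of the process.

```python
import math

def forecast_error_sq(a, T, n=20000):
    """Approximate delta^2(T) = integral_0^T |a(s)|^2 ds by the midpoint rule."""
    h = T / n
    return sum(abs(a((k + 0.5) * h)) ** 2 for k in range(n)) * h

alpha = 0.7                                   # hypothetical decay rate
kernel = lambda s: math.exp(-alpha * s)       # assumed kernel a(s)
closed_form = lambda T: (1 - math.exp(-2 * alpha * T)) / (2 * alpha)

horizons = (0.5, 1.0, 2.0, 8.0)
errors = [forecast_error_sq(kernel, T) for T in horizons]
```

The computed values grow with the forecasting horizon and stay below the total variance, in line with the definition of a regular process.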
An alternative expression of the optimal forecast is given by




where v is the stochastic spectral measure of the process ^ {t), 

00 T 

1 r 1 f 

h{iu) = a{s) e ds, hT(iu) = a{s) e 

\/^ i o 

The function h{iu) is determined from the spectral density / (u) by means 
of formula (23) of Section 7. 




Chapter V 



Probability Measures on Functional Spaces 



§1. Measures Associated with Random Processes 

Kolmogorov's theorem on the construction of a probability space from
finite-dimensional distributions of a random process with values in a
metric space $\mathcal{X}$ shows, in particular, how to construct a measure $\mu$ on
a measurable space $(\mathcal{F}, \mathfrak{B})$, where $\mathcal{F}$ is the space of all the functions
with values in $\mathcal{X}$ and $\mathfrak{B}$ is the minimal $\sigma$-algebra containing all the
cylinders in $\mathcal{F}$, such that, for any cylinder $C$, the value $\mu(C)$ coincides with
the probability that the sample function of the random process belongs
to $C$. This measure is called the measure associated with (or corresponding
to) the random process $\xi(t)$ and it can always be constructed irrespective
of the probability space on which the process $\xi(t)$ is defined. If
the process $\xi(t, \omega)$ is defined on the probability space $\{\Omega, \mathfrak{S}, P\}$, $T$ is the
mapping of $\Omega$ into $\mathcal{F}$ determined by the relation $T\omega = \xi(\cdot, \omega)$ and $\mathfrak{S}_0$ is
a sub-$\sigma$-algebra of $\mathfrak{S}$ consisting of the sets of the form $T^{-1}C$, where $C \in \mathfrak{B}$,
then the measure $\mu$ is the image of the measure $\tilde P$, which is the
contraction of the measure $P$ on $\mathfrak{S}_0$, under the mapping $T$, i.e.

$$\mu(C) = P(T^{-1}C). \tag{1}$$

The measurable space $(\mathcal{F}, \mathfrak{B})$ and the measure $\mu$ defined on it can be
conveniently utilized in the investigation of random processes for which
only finite dimensional distributions are given. Using these two notions
one can investigate the existence of processes with given finite
dimensional distributions whose sample functions satisfy certain regularity
conditions, and study various functionals on sample functions
of the process, transformations of random processes and so on.
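
Relation (1) can be illustrated on a finite toy example. Everything in the sketch below (the four outcomes, their probabilities, the two-point time set and the names `T` and `mu`) is hypothetical and chosen only to show how the measure of a cylinder is obtained as the probability of its preimage.

```python
from fractions import Fraction

# Hypothetical finite probability space: four outcomes with their probabilities.
P = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 8), "d": Fraction(1, 8)}

# A toy process with time set {0, 1}: T maps each outcome to its sample function.
def T(omega):
    paths = {"a": (0, 0), "b": (0, 1), "c": (1, 0), "d": (1, 1)}
    return paths[omega]

def mu(cylinder):
    """mu(C) = P(T^{-1} C): total probability of outcomes whose path lies in C."""
    return sum(p for omega, p in P.items() if cylinder(T(omega)))

starts_at_zero = lambda x: x[0] == 0          # cylinder {x: x(0) = 0}
ends_at_one = lambda x: x[1] == 1             # cylinder {x: x(1) = 1}
```

Here `mu(starts_at_zero)` collects the outcomes "a" and "b", and `mu(ends_at_one)` collects "b" and "d".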

Consider measurable functionals of sample functions of a random
process. We refer to an arbitrary random variable defined on the
probability space $\{\mathcal{F}, \mathfrak{B}, \mu\}$ as a functional on the random process $\xi(t)$.
Sometimes it may be convenient to consider a somewhat wider class of
random variables, i.e. random variables defined on $\{\mathcal{F}, \bar{\mathfrak{B}}, \bar\mu\}$, where
$\{\bar{\mathfrak{B}}, \bar\mu\}$ is the completion of the measure $\{\mathfrak{B}, \mu\}$. Since for any random
variable $\bar\zeta$ defined on $\{\mathcal{F}, \bar{\mathfrak{B}}, \bar\mu\}$ a variable $\zeta$ on $\{\mathcal{F}, \mathfrak{B}, \mu\}$ can be found
such that $\zeta = \bar\zeta \pmod{\bar\mu}$, the distinction is not a basic one. However, when
constructing various specific functionals we may get $\bar{\mathfrak{B}}$-measurable
functionals also. Obviously this can be avoided by using a more elaborate
construction, but this we shall not do.

Any functional on a sample function of a process is determined by
the values of the process. We show that this is true for the class of
functionals introduced above. We call a functional $f(x(\cdot))$ cylindrical if a
Borel function $f_m(x_1, \ldots, x_m)$ in $\mathcal{X}^m$ and points $t_1, \ldots, t_m$ exist such that
$f(x(\cdot)) = f_m(x(t_1), \ldots, x(t_m))$. If $f_m$ is continuous then the cylindrical
functional is also called continuous. Clearly every cylindrical functional is
$\mathfrak{B}$-measurable and thus determines a random variable on $\{\mathcal{F}, \mathfrak{B}, \mu\}$.
Moreover, the value of the functional is determined by the sample function
of the process.

The distribution of the variable $f_m(\xi(t_1), \ldots, \xi(t_m))$ coincides with the
distribution of the variable $f(x(\cdot))$ on $\{\mathcal{F}, \mathfrak{B}, \mu\}$. It is natural to
consider, as a functional on sample functions of a process $\xi(t)$, a random
variable $\eta$ for which a sequence of cylindrical functionals $f^{(m)}(x(\cdot))$ can
be found such that $f^{(m)}(\xi(\cdot)) \to \eta$ in probability as $m \to \infty$. In this case the
sequence of functionals $f^{(m)}(x(\cdot))$ converges in measure $\mu$ to a certain
$\bar{\mathfrak{B}}$-measurable functional. This follows from the relation

$$P\{|f^{(m)}(\xi(\cdot)) - f^{(n)}(\xi(\cdot))| > \varepsilon\} = \mu\{x: |f^{(m)}(x) - f^{(n)}(x)| > \varepsilon\}$$

(this equality is a particular case of equation (1)) and the convergence in
probability of $f^{(m)}(\xi(\cdot))$. We now show that for any $\mathfrak{B}$-measurable
functional $f$ there exists a sequence of cylindrical functionals which
converges in measure $\mu$ to $f$. To prove this it is sufficient to show that
for any $\mathfrak{B}$-measurable set $A$ there exists a sequence of cylindrical sets
$C_n$ such that $\mu(A \,\triangle\, C_n) \to 0$. If $\mathfrak{B}_0$ is the collection of such sets then 1) $\mathfrak{B}_0$
is an algebra, 2) it is a monotone class, 3) it contains all the cylinders,
namely $\mathfrak{B}_0$ is a $\sigma$-algebra which coincides with $\mathfrak{B}$.

Note that if the process $\xi(t)$ is defined on a probability space $\{\Omega, \mathfrak{S}, P\}$
and $\mathfrak{S}_0$ is the $\sigma$-algebra introduced above, then all the $\mathfrak{S}_0$-measurable
random variables are functionals of $\xi(\cdot)$. If $\eta(\omega)$ is an $\mathfrak{S}_0$-measurable
variable, then $\eta(\omega) = f(T\omega)$, where $f$ is a certain $\mathfrak{B}$-measurable
functional. We shall write $\eta(\omega) = f(\xi(\cdot, \omega))$.

It follows from the formula of change of measures in an integral
that

$$E f(\xi(\cdot, \omega)) = \int_{\mathcal{F}} f(x)\, \mu(dx) \tag{2}$$

for any functional $f$ provided the integral on the right is meaningful.

We now consider the problem of feasibility of constructing a measure
on a functional space which is smaller than the space of all the functions.
Clearly one can take an arbitrary $\mathfrak{B}$-measurable set $\mathcal{F}_0$ such that
$\mu(\mathcal{F}_0) = 1$ and consider a measure on $\mathcal{F}_0$. However, the interesting sets
of functions are not $\mathfrak{B}$-measurable, since every $\mathfrak{B}$-measurable set of
functions is determined by the behavior of the functions on at most a countable
number of points and this does not determine such properties as
continuity, differentiability, absence of discontinuities of the second kind,
measurability, etc.

Therefore the following approach to the construction of a measure
on a functional space would seem appropriate. Assume that the space
$\mathcal{F}_0$ is such that for any cylinder $C$ the set $C \cap \mathcal{F}_0$ is not empty. Then
one can consider in $\mathcal{F}_0$ the minimal $\sigma$-algebra $\mathfrak{B}_0$ which contains all the
sets of the form $C \cap \mathcal{F}_0$ (we shall call them the cylindrical sets in $\mathcal{F}_0$).
Define on the cylindrical sets $C_0$ in $\mathcal{F}_0$ an additive set function
$\mu_0(C_0) = \mu(C)$, where $C_0 = C \cap \mathcal{F}_0$. Note that this definition is unique: if
$C_0 = C \cap \mathcal{F}_0$ and $C_0 = C_1 \cap \mathcal{F}_0$, then the intersection
$[(C - C_1) \cup (C_1 - C)] \cap \mathcal{F}_0$ is void, which is impossible if $C \ne C_1$. Clearly $\mu_0$ is an
additive non-negative set function. A necessary and sufficient condition
for extending $\mu_0$ to a measure on $\mathfrak{B}_0$ is the condition that for any sequence
$C_0^n$ of cylindrical sets in $\mathcal{F}_0$ such that $\bigcup_n C_0^n = \mathcal{F}_0$ the inequality
$\sum_n \mu_0(C_0^n) \ge \mu_0(\mathcal{F}_0) = 1$ is satisfied. This requirement is equivalent to the following:
$\sum_n \mu(C^n) \ge 1$ for any sequence of cylindrical sets $C^n$ in $\mathcal{F}$ such that
$\bigcup_n C^n \supset \mathcal{F}_0$.

We define the outer measure $\mu^*$ in terms of the measure $\mu$ as follows:
for any set $A$

$$\mu^*(A) = \inf\Big\{\sum_n \mu(C^n);\ \bigcup_n C^n \supset A\Big\}.$$

Then to construct a measure $\tilde\mu_0$ on $\mathfrak{B}_0$ we must have $\mu^*(\mathcal{F}_0) = 1$. The
measure $\tilde\mu_0$ is then of the form $\tilde\mu_0(A) = \mu^*(A)$ for $A \in \mathfrak{B}_0$. To prove this we
note that $\mu(S) = 0$ for any $\mathfrak{B}$-measurable set $S$ such that $S \cap \mathcal{F}_0 = \emptyset$.
Indeed, otherwise $\mathcal{F}_0 \subset \mathcal{F} - S$ and hence

$$\mu^*(\mathcal{F}_0) \le \mu(\mathcal{F} - S) = 1 - \mu(S) < 1.$$







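Clearly, the $\sigma$-algebra $\mathfrak{B}_0$ consists of the sets of the form $A \cap \mathcal{F}_0$ where
$A$ is a $\mathfrak{B}$-measurable set.

Let $A_0 = A \cap \mathcal{F}_0$. Set $\tilde\mu_0(A_0) = \mu(A)$. The definition is
unique since if $A_0 = A \cap \mathcal{F}_0 = A' \cap \mathcal{F}_0$ then $(A - A') \cup (A' - A) \subset \mathcal{F} - \mathcal{F}_0$,
which implies that $\mu(A) = \mu(A')$. Note that $\tilde\mu_0$ is a countably-additive
measure on $\mathfrak{B}_0$: if the sets $A_0^k$ are pairwise disjoint and $A_0^k = A^k \cap \mathcal{F}_0$, then
$A^k \cap A^j \subset \mathcal{F} - \mathcal{F}_0$ for $k \ne j$ and $\mu(A^k \cap A^j) = 0$, i.e.

$$\tilde\mu_0\Big(\bigcup_k A_0^k\Big) = \mu\Big(\bigcup_k A^k\Big) = \sum_k \mu(A^k) = \sum_k \tilde\mu_0(A_0^k).$$

Moreover for any cylindrical set $C_0$, $\tilde\mu_0(C_0) = \mu_0(C_0)$. Hence $\tilde\mu_0$ is an
extension of $\mu_0$. On the other hand

$$\mu^*(A_0) = \inf\{\mu(A'),\ A' \in \mathfrak{B},\ A' \supset A_0\} = \mu(A),$$

if $A_0 = A \cap \mathcal{F}_0$.

Thus a measure associated with a random process can be considered
on any set of functions $\mathcal{F}_0$ with outer measure 1 and this measure
coincides on this set with the outer measure.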

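What then are the measurable functionals on the space $\{\mathcal{F}_0, \mathfrak{B}_0, \mu_0\}$?
We show that for any $\mathfrak{B}_0$-measurable functional $f_0(x)$ there exists a
$\mathfrak{B}$-measurable functional $f(x)$ such that $f(x) = f_0(x)$ for $x \in \mathcal{F}_0$. Let $E_0^{k,n}$ be
the set in $\mathfrak{B}_0$ defined by

$$E_0^{k,n} = \Big\{x: \frac{k}{2^n} \le f_0(x) < \frac{k+1}{2^n}\Big\}, \qquad n > 0,\ k = 0, \pm 1, \pm 2, \ldots$$

Denote by $E^{k,n}$ the $\mathfrak{B}$-measurable set such that $E^{k,n} \cap \mathcal{F}_0 = E_0^{k,n}$. The set $E^{k,n}$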

can always be chosen in such a manner that the following condi- 
tions are satisfied : 

1) are pairwise disjoint for a fixed n (otherwise we take the sets 

k- 1 

E^,n^^k,n_ y ^j, n 

j = -oo 

2) = (otherwise we put £ 2 fc,n+i 

nElk^n+l^ E2k+l,n+l _Ek,n_£2k,n+ly fuUCtioU 

equal to /c/2” on the set and equal to +oo if x^U £^’”. With this 

k 

definition /^”^(x):^ 1/2” and hence /^”^(x) is uniformly 
convergent to a measurable function /(x). Moreover |/^”^(x)— /q(x)| < 
< 1/2” if xeJ^o- Therefore for xe^q. Hence fo{x)= f{x) 

for $x \in \mathcal{F}_0$. Let another $\mathfrak{B}$-measurable function $f'(x)$ exist which
coincides with $f_0(x)$ on $\mathcal{F}_0$. Then the $\mathfrak{B}$-measurable set $\{x: f(x) \ne f'(x)\}$ is
disjoint from $\mathcal{F}_0$ and consequently has $\mu$-measure 0. Thus each
$\mathfrak{B}_0$-measurable function can be uniquely extended (mod $\mu$) to a
$\mathfrak{B}$-measurable function.
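
The extension argument rests on approximating a functional by step functions taking the values $k/2^n$. A minimal sketch of that rounding step follows; the function name `dyadic_step` is hypothetical. It shows that rounding a value down to the dyadic grid of level $n$ gives an error below $2^{-n}$ and that the approximations do not decrease as $n$ grows.

```python
import math

def dyadic_step(value, n):
    """Round value down to the nearest multiple of 2**-n, i.e. k / 2**n."""
    return math.floor(value * 2 ** n) / 2 ** n

# Assumed sample values of a functional, used only for illustration.
samples = (0.3, -1.7, 2.0, math.pi)
levels = range(1, 21)
table = {v: [dyadic_step(v, n) for n in levels] for v in samples}
```

Because the grids are nested, the approximations at successive levels form a non-decreasing sequence converging uniformly to the value being approximated.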







The last consideration shows that in the study of functionals of
random processes the transfer of a measure to a more restrictive space
has no significant implications. However, functionals defined on $\mathcal{F}_0$
often present a clearer picture. For example, if $\mathcal{F}_0$ is the space of
continuous functions then the functional

$$f_0(x) = \sup_t x(t)$$

will be measurable on $\mathcal{F}_0$. This functional can be extended to $\mathcal{F}$ by
means of the formula

$$f(x) = \sup_{t \in N} x(t),$$

where $N$ is a countable everywhere dense set of the values of the
argument. Clearly the form of the first functional is more natural.
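
For a continuous sample function the supremum over a countable dense set agrees with the supremum over the whole interval. The sketch below illustrates this on $[0, 1]$ with the dyadic points as the dense set; the sample function $\sin 3t$ and the name `sup_on_dyadics` are assumptions made for the example.

```python
import math

def sup_on_dyadics(x, level):
    """Supremum of x over the dyadic points k / 2**level of [0, 1]."""
    return max(x(k / 2 ** level) for k in range(2 ** level + 1))

path = lambda t: math.sin(3 * t)     # a continuous sample function on [0, 1]
approximations = [sup_on_dyadics(path, level) for level in range(1, 13)]
```

The values increase with the level and approach the true supremum 1, attained at $t = \pi/6$.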

The following observation plays an important role when one studies
measures and the feasibility of their transfer to $\mathcal{F}_0$: in many cases one
can find a countable set $N$ of values of the argument such that the
completion of the $\sigma$-algebra $\mathfrak{B}^N$, the minimal $\sigma$-algebra generated by
cylindrical sets determined by the values $x(t)$ for $t \in N$, coincides with $\bar{\mathfrak{B}}$.
For example, if the process is stochastically continuous then any
everywhere dense set of values of the argument can be chosen as $N$. The same
is true in the case of a one-sided stochastically continuous process. A
necessary and sufficient condition for the existence of such a set can
easily be given:

Lemma. In order that a set $N$ exist such that $\bar{\mathfrak{B}}^N = \bar{\mathfrak{B}}$ it is necessary and
sufficient that the Hilbert space $\mathcal{L}_2(\mu)$ of $\bar{\mathfrak{B}}$-measurable functions
square-integrable with respect to the measure $\mu$ be separable.

Proof. If such a set $N$ exists then the separability of $\mathcal{L}_2(\mu)$ follows from
the fact that $\mathcal{L}_2(\mu)$ coincides with $\mathcal{L}_2(\mu^N)$, where $\mu^N$ is the restriction
of the measure $\mu$ on $\mathfrak{B}^N$. (The separability of $\mathcal{L}_2(\mu^N)$ follows from the fact
that bounded cylindrical functions are dense in it and, in turn, that
continuous cylindrical functions are dense in the set of bounded
cylindrical functions.)

Now let $\mathcal{L}_2(\mu)$ be a separable space and $f_1, f_2, \ldots$ be a basis in $\mathcal{L}_2(\mu)$.
For each $\bar{\mathfrak{B}}$-measurable function $f_k$ there exists a countable set $N_k$ such
that $f_k$ is measurable with respect to $\bar{\mathfrak{B}}^{N_k}$. This follows from the
possibility of approximating $f_k$ in terms of cylindrical functions. In this case
the union $\bigcup_k N_k$ can be used for $N$. □

Note that the existence of an $N$ such that $\bar{\mathfrak{B}}^N = \bar{\mathfrak{B}}$ should not be
confused with the existence of an $N$ such that the process possesses an
$N$-separable equivalent. The construction of a separable equivalent
corresponds to the transfer of the measure $\mu$ onto the set of $N$-separable
functions.

However to verify the continuity of the process it is sufficient to
consider the values of the process on the set $N$ not only in the case when
the process is $N$-separable but also in the case when $\bar{\mathfrak{B}}^N = \bar{\mathfrak{B}}$. The first
case is discussed in Chapter III, Section 5. As far as the second is
concerned, we note that for the evaluation of the outer measure of the set
of continuous functions it is sufficient to utilize cylindrical sets in $\mathfrak{B}^N$.

In the conclusion of this section we state some general conditions
which assure the construction of a measure on the sets $\mathcal{C} \subset \mathcal{F}$ and $\mathcal{D} \subset \mathcal{F}$,
where $\mathcal{C}$ is the set of continuous functions and $\mathcal{D}$ is the set of functions
without discontinuities of the second kind. Clearly, in the first case the
process should be stochastically continuous, while in the second it should
have no more than a countable number of stochastic discontinuities. In
both cases it is easy to find $N$ such that $\bar{\mathfrak{B}}^N = \bar{\mathfrak{B}}$ ($N$ is a countable
everywhere dense set of values of the argument). The process itself is assumed
to be defined on a certain compactum $K$ ($t \in K$) in the first case and on a
closed interval in the second. The evaluation of the outer measures of
the corresponding sets is simplified by the fact that a minimal
$\mathfrak{B}^N$-measurable set can be found which contains the sets $\mathcal{C}$ and $\mathcal{D}$.

Condition for the existence of a measure on $\mathcal{C}$. In order that a measure $\mu$
be transferable onto $\mathcal{C}$ it is necessary and sufficient that the relation

$$\mu\Big(\bigcap_{r=1}^{\infty} \bigcup_{l=1}^{\infty} \bigcap_{\substack{|t-s| < 1/l \\ t \in N,\ s \in N}} \Big\{x(\cdot): |x(t) - x(s)| < \frac{1}{r}\Big\}\Big) = 1$$

be satisfied.

It is easy to verify that the set in the round brackets appearing under
the sign of the measure $\mu$ is the minimal $\mathfrak{B}^N$-measurable set containing
$\mathcal{C}$ (to do this it is sufficient to consider functions $x(\cdot)$ defined only on $N$).

Condition for the existence of a measure on $\mathcal{D}$. In order that a measure $\mu$
be transferable onto $\mathcal{D}$ it is necessary and sufficient that

$$\mu\Big(\bigcap_{r=1}^{\infty} \bigcup_{l=1}^{\infty} \bigcap_{\substack{u, s, t \in N \\ s < t < u < s + 1/l}} \Big[\Big\{x(\cdot): |x(u) - x(t)| < \frac{1}{r}\Big\} \cup \Big\{x(\cdot): |x(t) - x(s)| < \frac{1}{r}\Big\}\Big]\Big) = 1.$$

The fact that the set in the round brackets appearing under the sign
of the measure is the minimal $\mathfrak{B}^N$-measurable set containing $\mathcal{D}$ follows from
the fact that the functions $x(t)$ have no discontinuities of the second kind (cf.
Chapter III, Section 4).

§2. Measures in Metric Spaces 

In the previous section we pointed out the feasibility of transferring
a measure associated with a random process from the space of all
functions $\mathcal{F}$ to a certain smaller functional space $\mathcal{F}_0$. In the present section
we shall concern ourselves with the case when $\mathcal{F}_0$ is a separable metric
space and the $\sigma$-algebra $\mathfrak{B}_0$ coincides with the $\sigma$-algebra of all Borel sets in
$\mathcal{F}_0$. To show that we will be dealing with an interesting situation,
consider the case when $\mathcal{F}_0$ coincides with the space $\mathcal{C}$ of all real continuous
functions. Clearly $\mathcal{C}$ is a metric space with the metric
$\rho(x, y) = \sup_t |x(t) - y(t)|$. If we consider processes defined on a compactum, then $\mathcal{C}$ will
be separable. We show that $\mathfrak{B}_0$ coincides with the $\sigma$-algebra of Borel
sets. First we note that the cylindrical set $\{x(\cdot): x(t) \in A\}$, where $A$
is a Borel set, is a Borel set in $\mathcal{C}$. Therefore all the sets in $\mathfrak{B}_0$ are Borel
sets in $\mathcal{C}$. To show that all Borel sets in $\mathcal{C}$ belong to the $\sigma$-algebra $\mathfrak{B}_0$ it is
sufficient to show that an arbitrary closed sphere in $\mathcal{C}$ belongs to $\mathfrak{B}_0$.
Let

$$S = \{x(\cdot): \sup_t |x(t) - y(t)| \le \varepsilon\}$$

be a sphere with center at $y(\cdot)$ of radius $\varepsilon$. Then

$$S = \bigcap_{t \in N} \{x(\cdot): |x(t) - y(t)| \le \varepsilon\},$$

where $N$ is an arbitrary countable everywhere dense set in the domain of
definition of the values of the argument.

As we shall see in the sequel, one can introduce a metric in the space $\mathcal{D}$
of functions without discontinuities of the second kind such that $\mathfrak{B}_0$ will
coincide with the $\sigma$-algebra of Borel sets in $\mathcal{D}$. For future discussions the
specific form of the space is irrelevant. We shall consider an abstract
separable metric space $\mathcal{X}$ with elements $x, y, \ldots$ and metric $\rho(x, y)$. Denote
by $\mathfrak{B}$ the $\sigma$-algebra of Borel sets in $\mathcal{X}$ and let the measure $\mu$ be defined on
$\mathfrak{B}$. If $K$ is a subset of $\mathcal{X}$, we denote by $\mathcal{C}_K$ the space of all continuous
bounded functions defined on $K$. We denote the space $\mathcal{C}_{\mathcal{X}}$ simply by $\mathcal{C}$.
A natural metric in $\mathcal{C}_K$ is

$$\rho_K(f, g) = \sup_{x \in K} |f(x) - g(x)|.$$

Continuous functions on $\mathcal{X}$ form a simpler but at the same time sufficiently
broad class of functions; all the $\mathfrak{B}$-measurable functions can be obtained
from these functions by means of the limiting operation. Therefore the
measure $\mu$ is completely determined by the values of the integrals
$\int f(x)\, \mu(dx)$ for $f \in \mathcal{C}$: taking a sequence $f_n$ such that $f_n \to \chi_A$, where
$\chi_A$ is the indicator of the set $A$ in $\mathfrak{B}$, one can determine $\mu(A)$. In many cases
the measure $\mu$ is not given and it is unknown whether it exists. Only the
values of the integrals $L(f) = \int f(x)\, \mu(dx)$ are assigned. The question is:
under what conditions does the functional $L(f)$ defined on $\mathcal{C}$ admit a
representation in the form of an integral with respect to some finite measure?
The answer in the case of a complete metric space is given by the following



Theorem 1. In order that the functional $L(f)$, defined on the space $\mathcal{C}$ of
continuous bounded functions given on a complete separable metric space
$\mathcal{X}$, admit the representation

$$L(f) = \int f(x)\, \mu(dx), \tag{1}$$

where $\mu$ is a finite measure on $\mathfrak{B}$, it is necessary and sufficient that the
following conditions be satisfied:

1) $L(f) \ge 0$ for all $f \ge 0$;

2) $L(c_1 f_1 + c_2 f_2) = c_1 L(f_1) + c_2 L(f_2)$;

3) for any $\varepsilon > 0$ there exists a compactum $K_\varepsilon$ such that for any function
$f(x)$ for which $f(x) = 0$ for $x \in K_\varepsilon$, the inequality

$$|L(f)| \le \varepsilon \|f\|$$

is fulfilled, where $\|f\| = \sup_x |f(x)|$.
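
Condition 3) can be illustrated with a discrete measure. The sketch below is a hypothetical example, not part of the theorem: the space is the positive integers, the measure puts mass $2^{-k}$ at the point $k$ (truncated to 200 terms for computation), and $K_\varepsilon = \{1, \ldots, n\}$ with $n$ chosen so that the tail mass $2^{-n}$ does not exceed $\varepsilon$.

```python
def L(f, terms=200):
    """L(f) = sum_k 2**-k f(k), k = 1..terms: integration against a finite measure."""
    return sum(2.0 ** -k * f(k) for k in range(1, terms + 1))

def compact_for(eps):
    """Smallest n with tail mass sum_{k>n} 2**-k = 2**-n <= eps; K_eps = {1, ..., n}."""
    n = 0
    while 2.0 ** -n > eps:
        n += 1
    return n

eps = 0.01
n = compact_for(eps)
# A bounded function vanishing on K_eps with sup norm 3 (assumed for illustration).
f = lambda k: 0.0 if k <= n else (-1.0) ** k * 3.0
```

Any bounded function vanishing on $K_\varepsilon$ only sees the tail of the measure, so $|L(f)|$ is bounded by the tail mass times $\|f\|$.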

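Proof. Necessity. The necessity of conditions 1) and 2) is obvious. Since
for $f$ vanishing on $K_\varepsilon$ as in 3) the relation

$$\int f(x)\, \mu(dx) = \int_{\mathcal{X} - K_\varepsilon} f(x)\, \mu(dx)$$

holds, to prove the necessity of condition 3) we show that for any $\varepsilon > 0$
there exists a compactum $K_\varepsilon$ such that $\mu(\mathcal{X} - K_\varepsilon) \le \varepsilon$. Let $\{x_k,\ k = 1, 2, \ldots\}$
be a sequence everywhere dense in $\mathcal{X}$ and let $S_r(x)$ be the closed sphere
with center at $x$ of radius $r$. For each $r$, an $N_r$ can be found such that

$$\mu\Big(\mathcal{X} - \bigcup_{k=1}^{N_r} S_r(x_k)\Big) \le \varepsilon r.$$

Set

$$K = \bigcap_{n=1}^{\infty} \bigcup_{k=1}^{N_{2^{-n}}} S_{2^{-n}}(x_k);$$

then $K$ is a closed set and for each $n$ it possesses a finite $2^{-n}$-net. Thus
$K$ is a compactum. Next we have

$$\mu(\mathcal{X} - K) \le \sum_{n=1}^{\infty} \mu\Big(\mathcal{X} - \bigcup_{k=1}^{N_{2^{-n}}} S_{2^{-n}}(x_k)\Big) \le \sum_{n=1}^{\infty} \varepsilon 2^{-n} = \varepsilon.$$

The necessity of the conditions of the theorem is thus proved.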

Sufficiency. Let $F$ be a set. Set $\mu(F) = \inf L(f)$, where the infimum is
taken over all $f \ge 0$ such that $f(x) \ge 1$ for $x \in F$. We include $F$ in the class $\mathfrak{B}_0$
provided $\mu(F') = 0$, where $F'$ is the boundary of $F$. We show that $\mathfrak{B}_0$
forms an algebra of sets and that $\mu$ is an additive function on $\mathfrak{B}_0$. To do
this we note that $\mu(A \cup B) \le \mu(A) + \mu(B)$ and $\mu(A) \le \mu(B)$ for $A \subset B$; this
easily follows from the definition of $\mu$. Next $\mu((\mathcal{X} - F)') = \mu(F')$, and
$\mu((F_1 \cup F_2)') \le \mu(F_1' \cup F_2') \le \mu(F_1') + \mu(F_2')$, and thus the sets $F$ such that
$\mu(F') = 0$ form an algebra. We now prove the additivity of $\mu$ on $\mathfrak{B}_0$. Let
$F_1$ and $F_2$ be two disjoint sets in $\mathfrak{B}_0$. We first show that

$$\mu(F_1 \cup F_2) \ge \mu(F_1) + \mu(F_2).$$

Taking an arbitrary 8>0 we can find functions / and l^qy^O 

such that L{(p)^s, L(/)</i(Fi uF 2 ) + 8, (p{x)=l for xg[Fi] n [F 2 ] and 
f{x)^l for all xgFi u/ 2 . Now set 



/ib)= 



1 , 

(p{x), 



xe[Fi], 

xe[F2] 



and extend by continuity the definition of fi(x) over the whole space 
dC in such a manner that 0^ + (fi q{^) is an arbitrary 

continuous non-negative extension then fi{x) = min\_g{x),f{x) + (p{x)~\). 

Let /2 (x) =f(x)-\-(p (x) — /i (x). Clearly, /2 is a non-negative continuous 
function and / 2 (x)=/(x) = l for XGF 2 . Therefore 



/i(Fi) + /i(F2)^L(/0 + L(/2) = L(/)-FL((/>)^/i(FiuF2) + 2£, 
and since s is arbitrary 



FkO+Fki) ^^2)- 

The additivity of $\mu$ on $\mathfrak{B}_0$ follows from this relation and the semi-additivity
of $\mu$.

Note that in the case when $[F_1] \cap [F_2] = \emptyset$, the function $\varphi$ can be
chosen to be zero and therefore for such sets the relation

$$\mu(F_1 \cup F_2) = \mu(F_1) + \mu(F_2)$$

holds even if they do not belong to $\mathfrak{B}_0$.

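We now show that $\mu$ can be extended as a measure on $\sigma(\mathfrak{B}_0)$. For this
purpose it is sufficient to prove that for any sequence $A_n$ of decreasing
sets in $\mathfrak{B}_0$ such that $\bigcap_n A_n = \emptyset$, $\mu(A_n) \to 0$ as $n \to \infty$. Assuming that this is
not the case one can find a sequence of sets $A_n \in \mathfrak{B}_0$ such that $\mu(A_n) > \delta > 0$
and $\bigcap_n A_n = \emptyset$. We note that for any $A \in \mathfrak{B}_0$ and any $\varepsilon > 0$ there exists a
closed set $F \in \mathfrak{B}_0$ such that $F \subset A$ and $\mu(A) \le \mu(F) + \varepsilon$.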

Indeed, let f{x) be a continuous function and let F^ = {x:f(x) = c}. 
Then for all c, except possibly a finite number of them, jii{F^) = 0 since 
for different the sets are closed and are pairwise 

disjoint and hence ^ = F^)^ft{^). 

i 

Let f{x) be the function defined as f{x) = l for xe[^ — ^], /(x)<l 
•forx^^[^-^] and/i(^-^)^L(/)-^. 

Let X<\ be a number such that /i(F^) = 0. Denote by S the set 
(x: /(x)> A}. Then 

Since 5 g©q, the set ^ — S={x: f{x)^X} is a closed set belonging to ©o 
and 

1—X S 

It remains to choose A close to 1 in such a manner that -^)-\ 

-A 2A 

be less than s. 

Now let Fk be closed sets in ©□ such that F^^^k F{^k)^i^{Pk)F 

d ” ^ 

F„= n f*. Then 

^ k=l 

k=i k=i 2 2 

Therefore a decreasing sequence $F_n$ of closed sets belonging to $\mathfrak{B}_0$ has
been constructed such that $\mu(F_n) \ge \frac{\delta}{2}$ and $\bigcap_{n=1}^{\infty} F_n = \emptyset$. Using condition 3)
we now choose a compactum $K$ such that for all $f(x)$ satisfying $f(x) = 0$
for $x \in K$, $L(f) \le \frac{\delta}{4} \|f\|$. We show that the intersection of $F_n$ with $K$ is
non-void for all $n$. Indeed, if $F_n \cap K = \emptyset$ then a continuous function $g(x)$
can be constructed such that $0 \le g(x) \le 1$, $g(x) = 1$ for $x \in F_n$, $g(x) = 0$ for
$x \in K$. Then $\mu(F_n) \le L(g) \le \frac{\delta}{4} \|g\| = \frac{\delta}{4}$, which contradicts the construction of
$F_n$. Hence the sequence of non-empty compact sets $K_n = F_n \cap K$ satisfies the
conditions $K_n \supset K_{n+1}$ and $\bigcap_n K_n = \emptyset$, which is impossible. From the
contradiction thus obtained the countable additivity of $\mu$ on $\mathfrak{B}_0$ follows as well
as the feasibility of its extension over $\sigma(\mathfrak{B}_0)$. Denote the measure obtained
on $\sigma(\mathfrak{B}_0)$ by $\mu$. We now recall that for almost all $c$ the set $\{x: f(x) < c\} \in \mathfrak{B}_0$
for an arbitrary continuous function $f$. Therefore $\{x: f(x) < c\} \in \sigma(\mathfrak{B}_0)$
for all $c$ provided $f \in \mathcal{C}$. We thus obtain that $\sigma(\mathfrak{B}_0)$ contains $\mathfrak{B}$. Finally we
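show that equality (1) is satisfied. Let $f \in \mathcal{C}$, $0 \le f \le 1$, and let
$c_0 < 0 < c_1 < \dots < c_n$, $c_n > 1$, be such that the sets $\{x: f(x) = c_k\} \in \mathfrak{B}_0$
have $\mu$-measure zero. Then for any
$\varepsilon > 0$ continuous functions $\varphi_k(x) \ge 0$ with $\varphi_k(x) = 1$ for $x \in E_k = \{x: c_k < f(x) < c_{k+1}\}$,
$k = 0, \dots, n-1$, can be found such that $L(\varphi_k) < \mu(E_k) + \varepsilon/n$.

Therefore

$$L(f) \le L\Bigl(\sum_{k=0}^{n-1} c_{k+1}\varphi_k\Bigr) \le \sum_{k=0}^{n-1} c_{k+1}\,\mu(E_k) + \varepsilon c_n \le \int f(x)\,\mu(dx) + \varepsilon c_n + \max_k (c_{k+1} - c_k)\,\mu(\mathcal{X}).$$

Since $\varepsilon$ and $\max_k (c_{k+1} - c_k)$ can be chosen arbitrarily small, it follows that

$$L(f) \le \int f(x)\,\mu(dx).$$

Analogously,

$$L(1) - L(f) = L(1 - f) \le \int \bigl(1 - f(x)\bigr)\,\mu(dx) = \mu(\mathcal{X}) - \int f(x)\,\mu(dx).$$

Since $L(1) = \mu(\mathcal{X})$, we have $-L(f) \le -\int f(x)\,\mu(dx)$.

Thus equality (1) and the theorem are proved. □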

Remark 1. It follows from condition 3) of the theorem that for any finite
measure $\mu$ on $\mathcal{X}$ and for any $\varepsilon > 0$, one can find a compactum $K$ such
that $\mu(\mathcal{X} - K) < \varepsilon$.

Remark 2. It is easy to see that the completeness of the space was not used in the
proof of the sufficiency part of the theorem. However, completeness is
essential in the proof of necessity. In the case when the space $\mathcal{X}$ is a
Borel set in its completion $\bar{\mathcal{X}}$, the conditions of the theorem are also
necessary. Then for any $\varepsilon > 0$ a compactum $K$ can be constructed in $\bar{\mathcal{X}}$
such that $K \subset \mathcal{X}$ and $\mu(\mathcal{X} - K) < \varepsilon$.

On the other hand, if $\mathcal{X}$ is not complete, then the proof of necessity
of condition 3) as given above implies the existence of a compact set $K$
in $\bar{\mathcal{X}}$ (or a completely bounded $K$ in $\mathcal{X}$) satisfying $\mu(\mathcal{X} - K) < \varepsilon$. Considering
the functional $L(\varphi)$ only on those functions $\varphi$ which can be extended by
continuity to the whole space $\bar{\mathcal{X}}$ (these are the functions uniformly
continuous on each completely bounded set $K$ in $\mathcal{X}$), we can construct a
measure $\bar\mu$ on $(\bar{\mathcal{X}}, \bar{\mathfrak{B}})$, where $\bar{\mathfrak{B}}$ is the $\sigma$-algebra of Borel sets in the space $\bar{\mathcal{X}}$.
If it turns out that $\mathcal{X}$ viewed as a subset of $\bar{\mathcal{X}}$ has an outer measure which
coincides with the measure of the space $\bar{\mathcal{X}}$ (i.e. with $L(1)$), then the
measure $\bar\mu$ can be transferred onto $\mathcal{X}$ as was indicated in the previous
section. In order that the outer measure of the space $\mathcal{X}$ be equal to $L(1)$,
it is sufficient that for any sequence of non-negative continuous functions
$\varphi_n$ such that $\sum_{n=1}^{\infty} \varphi_n(x) \ge 1$ for each $x$, the inequality
$\sum_{n=1}^{\infty} L(\varphi_n) \ge L(1)$
be satisfied. The latter assertion follows from the fact that the outer
measure of any set $A$ in $\bar{\mathcal{X}}$ can be defined as $\inf \sum_n L(\varphi_n)$, where the infimum
is taken over all sequences of non-negative continuous functions
on $\bar{\mathcal{X}}$ such that

$$\sum_n \varphi_n(x) \ge 1, \qquad x \in A.$$

The stated condition is equivalent to the following: for any monotonic
sequence of non-negative continuous functions $\varphi_n$ on $\mathcal{X}$ such that $\varphi_n(x) \downarrow 0$
as $n \to \infty$ for all $x$, we have $\lim_{n \to \infty} L(\varphi_n) = 0$. It follows from the Lebesgue
theorem on monotone convergence that this condition is also necessary.
Therefore we have for the case of an incomplete space the following

Theorem 2. In order that a functional $L(f)$ defined on the space $\mathcal{C}$ of
continuous functions given on a metric separable space $\mathcal{X}$ admit representation
(1), where $\mu$ is a finite measure on $\mathcal{X}$, it is necessary and sufficient that the
following conditions be satisfied:

1) $L(f) \ge 0$ for all $f \ge 0$;

2) $L(c_1 f_1 + c_2 f_2) = c_1 L(f_1) + c_2 L(f_2)$ for all real $c_1$ and $c_2$ and $f_1, f_2 \in \mathcal{C}$;

3) for any decreasing sequence of non-negative functions $\varphi_n \in \mathcal{C}$ such
that $\varphi_n(x) \to 0$ for all $x$, $L(\varphi_n) \to 0$;

4) for any $\varepsilon > 0$ one can find a completely bounded set $K$ such that
$|L(f)| \le \varepsilon \|f\|$ for all $f \in \mathcal{C}$ satisfying $f(x) = 0$ for $x \in K$.
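The representation (1) can be seen in miniature when the space is a finite set, where every function is continuous and conditions 3) and 4) hold trivially. The following is only a toy sketch under that assumption: the point names and the weights defining `L` are hypothetical, and the measure is read off as $\mu(\{x\}) = L(1_{\{x\}})$.

```python
# Toy model: the space is a finite set, so every function on it is continuous.
points = ["a", "b", "c"]
weights = {"a": 0.5, "b": 1.5, "c": 2.0}  # hypothetical weights defining L


def L(f):
    """A positive linear functional on functions f: points -> float."""
    return sum(weights[x] * f(x) for x in points)


def recover_measure(functional):
    """Read off mu({x}) = L(indicator of {x}) for every point x."""
    return {
        x: functional(lambda y, x=x: 1.0 if y == x else 0.0)
        for x in points
    }


def integral(f, mu):
    """Integral of f against the discrete measure mu."""
    return sum(f(x) * mu[x] for x in points)


mu = recover_measure(L)
f = lambda x: {"a": 3.0, "b": -1.0, "c": 0.25}[x]
print(L(f), integral(f, mu))  # both sides of representation (1)
```

In an infinite metric space the indicator of a point is no longer continuous, which is why the proof above has to approximate sets by continuous functions instead.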
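In the conclusion of this section we consider the problem of determining
integrals of continuous functions in the case when $\mathcal{X}$ is the space
of continuous functions on $[a, b]$ with the metric $\rho(x, y) = \sup_t |x(t) - y(t)|$
and the measure $\mu$ is a measure associated with a certain random process.
We shall assume that the finite-dimensional distributions of this random process are
known, and let $F_{t_1, \dots, t_n}(dx_1, \dots, dx_n)$ be the joint distribution of the values
of the process at the points $t_1, \dots, t_n$. Denote by $\alpha$ a certain subdivision
$\{a = t_0^{(\alpha)} < \dots < t_{n_\alpha}^{(\alpha)} = b\}$ of the segment $[a, b]$, $|\alpha| = \max_i |t_{i+1}^{(\alpha)} - t_i^{(\alpha)}|$.
Let $x \in \mathcal{X}$. Set (writing $t_k$ for $t_k^{(\alpha)}$)

$$x_\alpha(t) = x(t_k)\,\frac{t_{k+1} - t}{t_{k+1} - t_k} + x(t_{k+1})\,\frac{t - t_k}{t_{k+1} - t_k} \qquad \text{for } t_k \le t \le t_{k+1}.$$

Clearly, $x_\alpha(t)$ is a piecewise-linear function which coincides with $x(t)$
at the points of the subdivision $\alpha$. If $x \in \mathcal{X}$, then $\rho(x(\cdot), x_\alpha(\cdot)) \to 0$ as $|\alpha| \to 0$.

Let $f$ be a certain continuous functional. Then for all $x \in \mathcal{X}$

$$f(x) = \lim_{|\alpha| \to 0} f(x_\alpha).$$

Denote $f(x_\alpha) = f_\alpha(x)$. The functional $f_\alpha(x)$ is a continuous cylindrical
functional. If $\|f\| < \infty$, then $\|f_\alpha\| \le \|f\|$, and in view of Lebesgue's theorem
on bounded convergence

$$\int f(x)\,\mu(dx) = \lim_{|\alpha| \to 0} \int f_\alpha(x)\,\mu(dx). \tag{2}$$

The functional $f_\alpha(x)$ is of the form $\varphi_\alpha\bigl(x(t_0^{(\alpha)}), \dots, x(t_{n_\alpha}^{(\alpha)})\bigr)$. Therefore the
integral in the r.h.s. of (2) can be evaluated by means of the finite-dimensional
distributions:

$$\int f_\alpha(x)\,\mu(dx) = \int \varphi_\alpha(x_0, \dots, x_{n_\alpha})\, F_{t_0^{(\alpha)}, \dots, t_{n_\alpha}^{(\alpha)}}(dx_0, \dots, dx_{n_\alpha}). \tag{3}$$

Formulas (2) and (3) allow us to define integrals of continuous functionals.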
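The approximation step behind (2) can be checked numerically on a single path. The sketch below is illustrative only: the path `x`, the grids, and the bounded continuous functional `f` (here $\min(1, \sup_t |x(t)|)$, with the supremum approximated on a fine grid) are all assumptions chosen for the example, not objects from the text.

```python
import math


def uniform_grid(a, b, n):
    """Subdivision a = t_0 < ... < t_n = b with equal steps."""
    return [a + (b - a) * k / n for k in range(n + 1)]


def interpolate(x, grid):
    """Return x_alpha: the piecewise-linear function agreeing with x on grid."""
    values = [x(t) for t in grid]

    def x_alpha(t):
        for k in range(len(grid) - 1):
            t0, t1 = grid[k], grid[k + 1]
            if t0 <= t <= t1:
                w = (t - t0) / (t1 - t0)
                return (1 - w) * values[k] + w * values[k + 1]
        raise ValueError("t outside [a, b]")

    return x_alpha


def sup_distance(x, y, a, b, m=2000):
    """Approximate rho(x, y) = sup |x(t) - y(t)| on a fine grid of m steps."""
    return max(abs(x(t) - y(t)) for t in uniform_grid(a, b, m))


def f(x, a=0.0, b=1.0, m=2000):
    """A bounded continuous functional: min(1, sup |x(t)|), sup taken on a fine grid."""
    return min(1.0, max(abs(x(t)) for t in uniform_grid(a, b, m)))


x = lambda t: 0.8 * math.sin(7.0 * t)  # hypothetical continuous path on [0, 1]
errors = [
    sup_distance(x, interpolate(x, uniform_grid(0.0, 1.0, n)), 0.0, 1.0)
    for n in (4, 16, 64)
]
gap = abs(f(x) - f(interpolate(x, uniform_grid(0.0, 1.0, 64))))
print(errors, gap)
```

The sup-distance shrinks as the subdivision is refined, and the value of the functional on $x_\alpha$ approaches its value on $x$, which is the pointwise convergence that the bounded convergence theorem turns into (2).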



§3. Measures on Linear Spaces. Characteristic Functionals 

Let $\mathcal{X}$ be the real line. Then the space $\mathcal{F}$ of all functions $x(t)$ defined on
a certain set $T$ and taking on values in $\mathcal{X}$ is a linear real space. Denote
by $\mathcal{L}$ the space of all linear functionals $l$ on $\mathcal{F}$ of the form

$$l(x) = \sum_{k=1}^{n} c_k\, x(t_k), \tag{1}$$

where $n$ is an arbitrary positive integer, $\{t_1, \dots, t_n\}$ is a set of points in
the domain of definition of the process, and $c_k$ are real numbers. The
$\sigma$-algebra $\mathfrak{B}$ defined in Section 1 coincides with the minimal $\sigma$-algebra
with respect to which all the functionals in $\mathcal{L}$ are measurable. The measure
on $\mathfrak{B}$ is completely determined by its values on the sets of the form
$\{x: l(x) < \alpha\}$ for all possible $l$. This follows from the fact that, knowing the
values of the measure $\mu$ on these sets, one can evaluate the integral

$$\int e^{i l(x)}\,\mu(dx) = \mathsf{E} \exp\Bigl\{ i \sum_{k=1}^{n} c_k\, \xi(t_k) \Bigr\}, \tag{2}$$

where $\xi(t)$ is the random process associated with the measure $\mu$; consequently,
one can compute the joint characteristic function of the variables
$\xi(t_1), \dots, \xi(t_n)$, which will allow us to obtain the joint distribution of
$\xi(t_1), \dots, \xi(t_n)$ for any selection of the values of the argument. Thus if
integral (2) is known, one can determine the marginal distributions of
the process $\xi(t)$, which in turn completely determine the measure $\mu$.

In the case when the measure $\mu$ is transferred from $\mathcal{F}$ into a smaller
space $\mathcal{F}_0$, the space $\mathcal{F}_0$ often turns out to be a linear space. At least
linear functionals of form (1) are defined on this space, and the $\sigma$-algebra
$\mathfrak{B}_0$ of measurable sets in $\mathcal{F}_0$ will also coincide with the minimal $\sigma$-algebra
with respect to which all the functionals of form (1) are measurable. It
follows from the above that in order to define a measure in such a case,
it is sufficient to know the distribution of all linear functionals on $\mathcal{F}_0$.
The specific form of the space is often irrelevant when studying measures
on various linear functional spaces. Therefore the following scheme will
be adopted. Let $\mathcal{X}$ be an arbitrary linear space (over the field of reals)
and $\mathcal{L}$ be a linear set of linear functionals $l(x)$ defined on $\mathcal{X}$. Denote by
$\mathfrak{B}$ the minimal $\sigma$-algebra with respect to which all the functions $l(x) \in \mathcal{L}$
are measurable. We shall consider probability measures $\mu$ on $\mathfrak{B}$. The
measure $\mu$ is completely determined by its characteristic functional

$$\chi(l) = \int e^{i l(x)}\,\mu(dx). \tag{3}$$



We now prove this assertion. We shall call any set of the form

$$\{x: l_1(x) \in A_1, \dots, l_n(x) \in A_n\},$$

where $n$ is a positive integer, $l_1, \dots, l_n$ are functionals in $\mathcal{L}$ and $A_1, \dots, A_n$
are Borel sets on the line, a cylindrical set in $\mathcal{X}$. Let $\mathfrak{B}_0$ be the algebra of
all cylindrical sets. Clearly, every functional $l \in \mathcal{L}$ is measurable with
respect to $\mathfrak{B}_0$, so that $\sigma(\mathfrak{B}_0) = \mathfrak{B}$. Therefore it is sufficient to define the
measure $\mu$ on $\mathfrak{B}_0$. If the functional $\varphi(x)$ defined on $\mathcal{X}$ is of the form
$\varphi(x) = g(l_1(x), \dots, l_n(x))$, where $n$ is an integer, $l_1, \dots, l_n \in \mathcal{L}$ and $g(s_1, \dots, s_n)$
is a Borel function of $n$ variables, then we call $\varphi$ a cylindrical function
and term it continuous cylindrical if $g$ is continuous. To define the measure
$\mu$ on $\mathfrak{B}_0$ it is sufficient to know the integrals $\int \varphi(x)\,\mu(dx)$ for continuous
bounded cylindrical functions $\varphi$, but these functions are limits of
everywhere convergent jointly bounded sequences of trigonometric polynomials
of the form

$$T(x) = \sum_{k=1}^{N} c_k \exp\Bigl\{ i \sum_{j=1}^{n} \lambda_{kj}\, l_j(x) \Bigr\}. \tag{4}$$

It remains to remark that formula (3) determines the values of the integrals
of functions $T(x)$ of the form (4):

$$\int T(x)\,\mu(dx) = \sum_{k=1}^{N} c_k\, \chi\Bigl( \sum_{j=1}^{n} \lambda_{kj}\, l_j \Bigr). \qquad □$$
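The last identity is easy to verify for a measure with finitely many atoms. The sketch below assumes a hypothetical discrete probability measure on the plane and takes $l_j(x) = x_j$ as the coordinate functionals; the coefficients `c` and `lam` are arbitrary illustrative values.

```python
import cmath

# Hypothetical discrete probability measure on R^2: (atom, mass) pairs.
atoms = [((0.0, 1.0), 0.2), ((1.5, -0.5), 0.5), ((-2.0, 0.3), 0.3)]


def chi(coeffs):
    """Characteristic functional at l = coeffs[0]*l_1 + coeffs[1]*l_2, l_j(x) = x_j."""
    return sum(
        m * cmath.exp(1j * sum(c * xj for c, xj in zip(coeffs, x)))
        for x, m in atoms
    )


def T(x, c, lam):
    """Trigonometric polynomial (4): sum_k c_k exp(i sum_j lam[k][j] l_j(x))."""
    return sum(
        ck * cmath.exp(1j * sum(l * xj for l, xj in zip(row, x)))
        for ck, row in zip(c, lam)
    )


c = [1.0, -0.5 + 2.0j, 0.25]
lam = [(0.0, 0.0), (1.0, 2.0), (-0.7, 0.4)]

direct = sum(m * T(x, c, lam) for x, m in atoms)         # integral of T against mu
via_chi = sum(ck * chi(row) for ck, row in zip(c, lam))  # uses only values of chi
print(abs(direct - via_chi))
```

The second quantity never touches the atoms directly, which is the point of the argument: the integrals of trigonometric polynomials, and hence the measure on cylindrical sets, are functions of $\chi$ alone.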

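We now investigate the degree of arbitrariness of a characteristic
functional $\chi(l)$:

1) A characteristic functional should be positive definite: for any
$l_1, \dots, l_n$ belonging to $\mathcal{L}$ and any complex numbers $\alpha_1, \dots, \alpha_n$

$$\sum_{k,j=1}^{n} \chi(l_k - l_j)\,\alpha_k \bar\alpha_j \ge 0, \tag{5}$$

since

$$\sum_{k,j=1}^{n} \chi(l_k - l_j)\,\alpha_k \bar\alpha_j = \int \Bigl| \sum_{k=1}^{n} \alpha_k e^{i l_k(x)} \Bigr|^2 \mu(dx).$$

2) Moreover, the functional $\chi(l)$ must be continuous in the following
sense: define $l_n \to l$ if $l_n(x) \to l(x)$ for all $x \in \mathcal{X}$; then $\chi(l_n) \to \chi(l)$ as $l_n \to l$.

Now let a functional $\chi(l)$, which is positive definite and continuous
in the above sense, be defined on $\mathcal{L}$. Are these properties sufficient for
the existence of a measure such that formula (3) is satisfied? Note that
for any $l_1, \dots, l_n \in \mathcal{L}$ the function $\varphi(s_1, \dots, s_n) = \chi\bigl(\sum_{k=1}^{n} s_k l_k\bigr)$ is a
characteristic function in the arguments $s_1, s_2, \dots, s_n$. Therefore a distribution
$P_{l_1, \dots, l_n}$ in $n$-dimensional space exists such that

$$\chi\Bigl(\sum_{k=1}^{n} s_k l_k\Bigr) = \int \exp\Bigl\{ i \sum_{k=1}^{n} s_k u_k \Bigr\}\, P_{l_1, \dots, l_n}(du_1, \dots, du_n).$$

Define a set function by the relation

$$\mu(\{x: l_1(x) \in A_1, \dots, l_n(x) \in A_n\}) = \int_{A_1} \cdots \int_{A_n} P_{l_1, \dots, l_n}(du_1, \dots, du_n).$$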
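Positive definiteness in the sense of (5) can be checked directly when the measure is discrete. This is a minimal sketch under that assumption: the atoms, the functionals `ls` (given by coefficient vectors acting on coordinates) and the complex numbers `alphas` are hypothetical illustrative values.

```python
import cmath

# Hypothetical discrete probability measure on R^2: (atom, mass) pairs.
atoms = [((0.0, 1.0), 0.2), ((1.5, -0.5), 0.5), ((-2.0, 0.3), 0.3)]


def apply(l, x):
    """Value l(x) of the linear functional with coefficient vector l."""
    return sum(c * xj for c, xj in zip(l, x))


def chi(l):
    """Characteristic functional chi(l) = integral of exp(i l(x)) mu(dx)."""
    return sum(m * cmath.exp(1j * apply(l, x)) for x, m in atoms)


def hermitian_form(ls, alphas):
    """Left-hand side of (5): sum_{k,j} chi(l_k - l_j) alpha_k conj(alpha_j)."""
    total = 0j
    for lk, ak in zip(ls, alphas):
        for lj, aj in zip(ls, alphas):
            diff = tuple(p - q for p, q in zip(lk, lj))
            total += chi(diff) * ak * aj.conjugate()
    return total


def squared_modulus_integral(ls, alphas):
    """Integral of |sum_k alpha_k exp(i l_k(x))|^2 against the measure."""
    return sum(
        m * abs(sum(a * cmath.exp(1j * apply(l, x)) for l, a in zip(ls, alphas))) ** 2
        for x, m in atoms
    )


ls = [(1.0, 0.0), (0.0, 1.0), (2.0, -1.0)]
alphas = [1.0 + 1.0j, -2.0 + 0j, 0.5j]
lhs = hermitian_form(ls, alphas)
rhs = squared_modulus_integral(ls, alphas)
print(lhs, rhs)
```

Both numbers agree up to rounding and the common value is real and non-negative, as the integral representation of the Hermitian form requires.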
It is easy to show that when the same cylindrical set is expressed in
several different ways we obtain the same expression for the function $\mu$.
The function $\mu$ will be additive on $\mathfrak{B}_0$ and can be extended to a countably
additive function on each of the $\sigma$-algebras $\mathfrak{B}_{l_1, \dots, l_n}$, the minimal $\sigma$-algebra
with respect to which the functionals $l_1, \dots, l_n$ are measurable.

Therefore, for any bounded Borel function $g(u_1, \dots, u_n)$ and any $l_1, \dots, l_n$
one can define the integral

$$\int g(l_1(x), \dots, l_n(x))\,\mu(dx).$$

In particular,

$$\int e^{i l(x)}\,\mu(dx) = \chi(l).$$

Therefore, given $\chi(l)$, one can always construct a finitely additive set function
$\mu$ which is countably additive on each $\sigma$-algebra $\mathfrak{B}_{l_1, \dots, l_n}$ and such that
equality (3) is satisfied. Simple examples (to be given in Section 6)
show that $\mu$ is not always countably additive on $\mathfrak{B}_0$, and hence it cannot always
be extended to a measure defined on $\mathfrak{B}$. However, one can always
construct a certain extension $\tilde{\mathcal{X}}$ of the space $\mathcal{X}$ in which such a measure
$\tilde\mu$ exists; moreover, $\tilde{\mathcal{X}}$ will also be linear, and the functionals in $\mathcal{L}$ can be
extended to $\tilde{\mathcal{X}}$ in such a manner that they will be linear on $\tilde{\mathcal{X}}$. We shall
show how this can be accomplished.

Let $\Phi$ denote the space of all numerical functions $\varphi(l)$ defined on
$\mathcal{L}$ (these functions may admit also infinite values but of one definite
(fixed) sign). Define a real random function $\xi(l)$ on $\mathcal{L}$ such that for any
choice of $l_1, \dots, l_n$ in $\mathcal{L}$ the joint distribution of the variables $\xi(l_1), \dots, \xi(l_n)$
will be given by the following characteristic function:

$$\mathsf{E} \exp\Bigl\{ i \sum_{k=1}^{n} s_k\, \xi(l_k) \Bigr\} = \chi\Bigl( \sum_{k=1}^{n} s_k l_k \Bigr).$$

It is easy to verify the compatibility of the corresponding distributions,
so that the existence of the random function $\xi(l)$ follows from Theorem 2,
Section 4, Chapter I. Let $\bar\mu$ be the measure on $\Phi$ corresponding to $\xi(l)$.
We transfer the measure $\bar\mu$ to a smaller space.

Denote by $\Lambda^{\mathcal{L}}$ the set of all linear functions $\lambda(l)$ on $\mathcal{L}$, i.e. such that

$$\lambda(c_1 l_1 + c_2 l_2) = c_1 \lambda(l_1) + c_2 \lambda(l_2)$$

for all $l_1, l_2 \in \mathcal{L}$ and all real $c_1$ and $c_2$.

We show that the outer measure of the set $\Lambda^{\mathcal{L}}$ is equal to 1.

Let $S_n$ be an arbitrary decreasing sequence of cylindrical sets in $\Phi$
for which $\bigcap_{n=1}^{\infty} S_n \cap \Lambda^{\mathcal{L}}$ is void. Without loss of generality one can assume
that the sets $S_n$ are determined by the values of the function $\varphi$ at the
points $l_1, \dots, l_n$, where $\{l_k, k = 1, 2, \dots\}$ is a sequence of functionals. We
shall write in order all the linear relations satisfied by the functionals:

$$\sum_{k=1}^{n} c_{nk}\, l_k = 0$$

(if $l_n$ is linearly independent of $l_1, \dots, l_{n-1}$, then the coefficients $c_{nk} = 0$,
otherwise $c_{nn} \ne 0$).

Let $D_n = \{\varphi: \sum_{k=1}^{n} c_{nk}\, \varphi(l_k) = 0\}$. Then

$$\emptyset = \bigcap_{n=1}^{\infty} S_n \cap \Lambda^{\mathcal{L}} = \bigcap_{n=1}^{\infty} (S_n \cap D_1 \cap \dots \cap D_n).$$

Since $S_n \cap D_1 \cap \dots \cap D_n$ is a decreasing sequence of cylindrical sets, it
follows from the relation

$$\emptyset = \bigcap_{n=1}^{\infty} [S_n \cap D_1 \cap \dots \cap D_n]$$

that $\bar\mu(S_n \cap D_1 \cap \dots \cap D_n) \to 0$ as $n \to \infty$. Finally note that $\bar\mu(D_n) = 1$ for all $n$,
and therefore $\bar\mu(S_n \cap D_1 \cap \dots \cap D_n) = \bar\mu(S_n)$ and hence $\bar\mu(S_n) \to 0$. This means
that the outer measure of $\Lambda^{\mathcal{L}}$ is 1. Hence the measure $\bar\mu$ can be transferred
onto $\Lambda^{\mathcal{L}}$. Next let $X_0$ be the linear manifold in $\mathcal{X}$ for which $l(x) = 0$ for all
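$x \in X_0$, $l \in \mathcal{L}$, and let $X^1$ be the quotient space of $\mathcal{X}$ by $X_0$. Each element
$x^1 \in X^1$ can be considered as a linear functional on $\mathcal{L}$: $x^1(l) = l(x)$, where
$x$ is any representative of the residue class of $x^1$ modulo $X_0$. Denote by
$\tilde{\mathcal{X}}$ the set of pairs $\tilde x = (x; \lambda)$, where $x \in X_0$, $\lambda \in \Lambda^{\mathcal{L}}$. Let $P$ be a linear
operator which maps $\mathcal{X}$ into $X_0$ such that $Px = x$ for all $x \in X_0$, and let $x^1(x)$
denote the residue class in $X^1$ to which $x$ belongs. Then there exists a
natural embedding of $\mathcal{X}$ into $\tilde{\mathcal{X}}$:

$$x \to (Px,\; x^1(x)).$$

Define on $\tilde{\mathcal{X}}$ a $\sigma$-algebra $\tilde{\mathfrak{B}}$ of sets of the form

$$\tilde A = \{\tilde x = (x; \lambda): \lambda \in A\},$$

where $A$ is an arbitrary set in $\mathfrak{A}$ and $\mathfrak{A}$ is the $\sigma$-algebra of subsets of
$\Lambda^{\mathcal{L}}$ on which the measure $\bar\mu$ is defined. We next set $\tilde\mu(\tilde A) = \bar\mu(A)$ and show
that this is the required measure. Note that the functionals $l$ can be defined
on $\tilde{\mathcal{X}}$ by the formula

$$l(\tilde x) = l((x; \lambda)) = \lambda(l).$$

This functional is linear and it coincides with $l(x)$ on $\mathcal{X}$ viewed as a subset
of $\tilde{\mathcal{X}}$: $l(x) = l((Px;\, x^1(x)))$, since $x^1(x)$ as an element of $\Lambda^{\mathcal{L}}$ is determined
by the formula $x^1(x)(l) = l(x)$. Finally consider the integral (3). From the
construction of the measure $\bar\mu$ on cylindrical sets we have:

$$\int e^{i l(\tilde x)}\,\tilde\mu(d\tilde x) = \int e^{i \lambda(l)}\,\bar\mu(d\lambda) = \mathsf{E}\, e^{i \xi(l)} = \chi(l).$$

Thus $\tilde\mu$ is the required measure.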

It is especially simple to construct $\tilde{\mathcal{X}}$ in the case when $X_0$ is the singleton
$\{0\}$. This will be the case if the set of functionals $\mathcal{L}$ is so large that
for each pair $x_1 \ne x_2$ in $\mathcal{X}$ a functional $l$ can be found such that
$l(x_1) \ne l(x_2)$. In this case the space $\Lambda^{\mathcal{L}}$ can be chosen for $\tilde{\mathcal{X}}$, where each element
$x \in \mathcal{X}$ determines an element of $\Lambda^{\mathcal{L}}$ by means of the formula

$$x(l) = l(x).$$

Clearly, every measure on $\tilde{\mathcal{X}}$ determines a characteristic functional $\chi(l)$
on $\mathcal{L}$; however, the functional is not necessarily continuous in the sense
indicated in condition 2). In order that condition 2) be satisfied it is
necessary and sufficient that the measure $\tilde\mu$ possess the following property: for
any sequence $l_n$ such that $l_n(x) \to 0$ for all $x \in \mathcal{X}$, $l_n(\tilde x) \to 0$ in measure $\tilde\mu$.
Note also that if $\mathcal{X}$ and $\mathcal{L}$ are chosen in such a manner that the space
$\Lambda^{\mathcal{L}}$ coincides with $\mathcal{X}$, then the constructed measure will be a measure
on $\mathcal{X}$.

Now let $\mathcal{X}$ be a linear normed complete separable space. It is natural
to choose for $\mathcal{L}$ the space $\mathcal{X}^*$ of all continuous linear functionals (the
elements of which will be denoted by $x^*$). The minimal $\sigma$-algebra with
respect to which all the functionals $x^*(x)$ are measurable coincides with
the $\sigma$-algebra $\mathfrak{B}$ of all Borel sets in $\mathcal{X}$. Every probability measure $\mu$ on $\mathcal{X}$
is determined by its characteristic functional

$$\chi(x^*) = \int e^{i x^*(x)}\,\mu(dx).$$

This characteristic functional will be positive definite and weakly
continuous on $\mathcal{X}^*$. If a functional $\chi(x^*)$ possessing these properties is given,
one can then construct a finitely additive measure $\mu$ on the algebra $\mathfrak{B}_0$ of
all cylindrical sets. In the case when $\mathcal{X}^*$ is a separable space we shall find
a necessary and sufficient condition for this measure to be extended to
a countably additive one on $\mathfrak{B}$. The requirement is that $\lim_{n \to \infty} \mu(S_n) = 0$ for any
sequence $S_n$ of cylindrical sets satisfying the conditions $\bigcap_n S_n = \emptyset$ and
$S_n \supset S_{n+1}$. Let the set $S_n$ be of the form $\{x: (l_1(x), \dots, l_n(x)) \in A^n\}$, where $A^n$
is a Borel set in $\mathcal{R}^n$. We call $S_n$ closed if $A^n$ is closed in $\mathcal{R}^n$. It turns out
that it is sufficient to verify the continuity only on closed $S_n$, since for any
$\varepsilon_n > 0$ one can find a closed $F^n \subset A^n$ such that

$$\mu(\{x: (l_1(x), \dots, l_n(x)) \in A^n - F^n\}) < \varepsilon_n.$$

Utilizing this observation we find the condition for the existence of a measure
$\mu$ on $\mathfrak{B}$ with a given characteristic functional.

Let $\{x_k^*(x),\ k = 1, 2, \dots\}$ be an everywhere dense set on the unit sphere
of the space $\mathcal{X}^*$. Since $|x| = \sup_k x_k^*(x)$, the relation

$$0 = \lim_{N \to \infty} \mu(\{x: |x| > N\}) = \lim_{N \to \infty} \lim_{n \to \infty} \mu(\{x: \sup_{k \le n} x_k^*(x) > N\})$$

is satisfied, provided $\mu$ is a countably additive measure. Therefore in
order that $\mu$ be countably additive it is necessary that the condition

$$\lim_{N \to \infty} \lim_{n \to \infty} \mu(\{x: \sup_{k \le n} x_k^*(x) > N\}) = 0 \tag{6}$$

be satisfied.
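The identity $|x| = \sup_k x_k^*(x)$ used above can be illustrated in the Euclidean plane, where unit functionals are unit vectors. In the sketch below the dense sequence of directions is generated by an irrational rotation; that particular choice, and the test vector, are assumptions made only for the example.

```python
import math


def unit_functionals(n):
    """First n directions of a dense sequence on the unit circle."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0  # irrational rotation number, dense orbit
    directions = []
    for k in range(1, n + 1):
        theta = 2.0 * math.pi * ((k * phi) % 1.0)
        directions.append((math.cos(theta), math.sin(theta)))
    return directions


def sup_functional(x, n):
    """h_n(x) = max over k <= n of x_k^*(x), a cylindrical function of x."""
    return max(u * x[0] + v * x[1] for u, v in unit_functionals(n))


x = (3.0, -4.0)  # Euclidean norm 5
approximations = [sup_functional(x, n) for n in (5, 50, 500)]
print(approximations)
```

The values increase with $n$ and approach the norm from below, so the sets $\{x: \sup_{k \le n} x_k^*(x) > N\}$ increase to $\{x: |x| > N\}$, which is what turns countable additivity into condition (6).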

We now show that this condition is also sufficient. Let $\mu$ be a finitely additive
measure on cylindrical sets constructed by means of $\chi$. It follows from
(6) that for any $\varepsilon > 0$ an $N$ can be found such that for all $n$

$$\mu(\{x: x_k^*(x) \le N,\ k = 1, \dots, n\}) \ge 1 - \varepsilon.$$

Assume that for a certain sequence of closed cylindrical sets
$S_n = \{x: (x_1^*(x), \dots, x_n^*(x)) \in F^n\}$, $S_n \supset S_{n+1}$, the relation $\mu(S_n) \ge 2\varepsilon$ is satisfied.

We show that $\bigcap_n S_n$ is not void. Let

$$K_N = \{x: |x| \le N\}, \qquad K_N^n = \{x: x_k^*(x) \le N,\ k = 1, \dots, n\}.$$

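The intersection $S_n \cap K_N$ is not void. Indeed, if $S_n \cap K_N$ is void, then
$d = \inf_{x \in S_n} |x| > N$ (since the infimum is achieved). Set

$$g(u_1, \dots, u_n) = \inf\{|x|: x_1^*(x) = u_1, \dots, x_n^*(x) = u_n\}.$$

Then the set $S_n$ is contained in the set $\{x: g(x_1^*(x), \dots, x_n^*(x)) \ge d\}$. The
set $\{(u_1, \dots, u_n): N < g(u_1, \dots, u_n) < d\}$ is open in $\mathcal{R}^n$ and is the difference
of two simply connected regions. Therefore a polyhedron exists with
faces of the form $\{(u_1, \dots, u_n): \sum_{i=1}^{n} \lambda_{ij} u_i = b_j\}$, $j = 1, \dots, m$, such that the set
$\{(u_1, \dots, u_n): g(u_1, \dots, u_n) \le N\}$ is totally contained within the polyhedron
and the set $\{(u_1, \dots, u_n): g(u_1, \dots, u_n) \ge d\}$ lies outside of this polyhedron. Then
the set $S_n$ is contained entirely in the union of the sets

$$\Bigl\{x: \sum_{i=1}^{n} \lambda_{ij}\, x_i^*(x) \ge b_j\Bigr\}, \qquad j = 1, \dots, m,$$

and each one of the summands is disjoint from $K_N$. Denote

$$y_j^*(x) = \sum_{i=1}^{n} \lambda_{ij}\, x_i^*(x).$$

The set $\{x: y_j^*(x) \ge b_j\}$ is disjoint from $K_N$ if and only if $b_j > \|y_j^*\| N$.

Let $y_j^* / \|y_j^*\| = z_j^*$. Then $S_n$ is entirely contained in the set
$\bigcup_{j=1}^{m} \{x: z_j^*(x) \ge N + \delta\}$, where $\delta = \inf_j \dfrac{b_j}{\|y_j^*\|} - N > 0$.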

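It follows from the continuity of $\chi(x^*)$ that for almost all $\alpha_1, \dots, \alpha_m$,

$$\lim_{k \to \infty} \mu(\{x: z_{1k}^*(x) < \alpha_1, \dots, z_{mk}^*(x) < \alpha_m\}) = \mu(\{x: z_1^*(x) < \alpha_1, \dots, z_m^*(x) < \alpha_m\}),$$

provided only $\|z_{jk}^* - z_j^*\| \to 0$. We choose in the set $\{x_k^*,\ k = 1, 2, \dots\}$
sequences $x_{j_k}^*$, $j = 1, \dots, m$, such that $\|x_{j_k}^* - z_j^*\| \to 0$. Then

$$\mu(S_n) \le \mu\Bigl( \bigcup_{j=1}^{m} \{x: z_j^*(x) \ge N + \delta\} \Bigr) = 1 - \mu(\{x: \sup_{j \le m} z_j^*(x) < N + \delta\}) \le 1 - \lim_{k \to \infty} \mu(\{x: \sup_{j \le m} x_{j_k}^*(x) \le N\}) \le \varepsilon.$$

This contradicts the inequality $\mu(S_n) \ge 2\varepsilon$. Thus $S_n \cap K_N$ is not void. Therefore
the sequence of imbedded weakly closed sets $S_n \cap K_N$ belongs to the
weakly compact set $K_N$, and therefore $\bigcap_n \{S_n \cap K_N\}$ is non-void and hence
$\bigcap_n S_n$ is not void. Consequently, if $S_n$ is a decreasing sequence of cylindrical
sets such that $\bigcap_n S_n = \emptyset$, then $\lim_{n \to \infty} \mu(S_n) = 0$, i.e. $\mu$ is countably
additive. We have thus proved the following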

Theorem. In order that a continuous positive definite functional $\chi(x^*)$ be
the characteristic functional of a measure on $(\mathcal{X}, \mathfrak{B})$, where $\mathcal{X}$ is a Banach
space such that $\mathcal{X}^*$ is separable, it is necessary and sufficient that the finitely
additive measure $\mu$ generated by the functional on cylindrical sets satisfy condition
(6) for a certain set $\{x_k^*,\ k = 1, 2, \dots\}$ which is everywhere dense on the
unit sphere of the space $\mathcal{X}^*$.

Remark. Condition (6) can be replaced by the following: let $h_n(x)$ be a
sequence of continuous cylindrical functions such that $\lim_{n \to \infty} h_n(x) = |x|$.
Then

$$\lim_{N \to \infty} \lim_{n \to \infty} \mu(\{x: h_n(x) > N\}) = 0. \tag{7}$$

Condition (7) becomes (6) if $h_n(x) = \sup_{k \le n} x_k^*(x)$. Moreover, (6) can be
expressed in terms of $\chi(x^*)$ using the inversion formula.



§4. Measures in Spaces $\mathcal{L}_p$

Spaces $\mathcal{L}_p[a, b]$ of real measurable functions $x(t)$ defined on $[a, b]$ such
that

$$\int_a^b |x(t)|^p\,dt < \infty$$

serve as an important class of linear normed spaces. We shall consider
only the case when $p \ge 1$. Let a certain probability space $\{\Omega, \mathfrak{S}, \mathsf{P}\}$ be
fixed. We shall study the conditions under which a measure in $\mathcal{L}_p$ is
associated with a given numerical process $\xi(t, \omega)$ defined on $[a, b]$. Assume
that $\xi(t, \omega)$ is a measurable process. Then in view of Fubini's theorem
$\xi(t, \omega)$ as a function of $t$ is measurable with probability 1. Therefore
with probability 1 the integral

$$\int_a^b |\xi(t, \omega)|^p\,dt$$

is defined (this integral may also admit infinite values). Moreover, the
integral is also a measurable function of $\omega$.

Assume that

$$\mathsf{P}\Bigl\{ \int_a^b |\xi(t, \omega)|^p\,dt < \infty \Bigr\} = 1. \tag{1}$$

We show that under this condition a measure can be constructed in the
space $\mathcal{L}_p$ which is associated with the process $\xi(t, \omega)$, i.e. a measure $\mu$
such that for any Borel set $B$ of the space $\mathcal{L}_p$

$$\mu(B) = \mathsf{P}(\{\omega: \xi(\cdot, \omega) \in B\}). \tag{2}$$

Relation (2) can be taken as the definition of the measure $\mu$ provided we
prove that $\{\omega: \xi(\cdot, \omega) \in B\} \in \mathfrak{S}$ for $B \in \mathfrak{B}$. To show this consider the class
$L$ of functionals defined on $\mathcal{L}_p$ of the form

$$l(x) = \int_a^b l(t)\,x(t)\,dt,$$

where $l(t)$ is a bounded measurable function. Functionals in $L$ are defined
on $\mathcal{L}_p$ for any $p \ge 1$, and $L$ is dense in the space of linear continuous
functionals for any $p$. Denote by $\mathfrak{B}_0$ the set of all those $B \in \mathfrak{B}$ such that
$\{\omega: \xi(\cdot, \omega) \in B\} \in \mathfrak{S}$. Clearly $\mathfrak{B}_0$ is a $\sigma$-algebra. Since $l(\xi(\cdot, \omega))$ is measurable
with respect to $\mathfrak{S}$ for any $l$ in $L$, then for each continuous linear
functional $l$ on $\mathcal{L}_p$ the variable $l(\xi(\cdot, \omega))$ is $\mathfrak{S}$-measurable. Therefore $\mathfrak{B}_0$
coincides with the minimal $\sigma$-algebra with respect to which all continuous
linear functionals $l(x)$ on $\mathcal{L}_p$ are measurable, and this $\sigma$-algebra coincides
with $\mathfrak{B}$. Thus relation (2) indeed defines a certain measure $\mu$. Since for
any continuous functional $l$ on $\mathcal{L}_p$ the variable $l(\xi(\cdot))$ is measurable with
respect to $\mathfrak{S}$, the measure $\mu$ can be given by means of the characteristic
functional

$$\chi(l) = \int e^{i l(x)}\,\mu(dx) = \mathsf{E}\, e^{i l(\xi(\cdot, \omega))}. \tag{3}$$

This characteristic functional uniquely determines the measure $\mu$.
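Formula (3) can be evaluated by hand for a very simple process. The sketch below assumes a hypothetical process $\xi(t, \omega) = Z(\omega)\,t$ on $[0, 1]$ with $Z = \pm 1$ equally likely, and uses a midpoint rule for the integral defining $l(x)$; for $l(t) \equiv 1$ the exact value is $\cos(1/2)$.

```python
import cmath
import math


def integrate(g, a, b, n=1000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))


# Hypothetical process with two outcomes: xi(t, omega) = Z(omega) * t, Z = +1 or -1.
outcomes = [(0.5, lambda t: t), (0.5, lambda t: -t)]


def chi(l, a=0.0, b=1.0):
    """chi(l) = E exp(i * integral of l(t) xi(t) dt), as in (3)."""
    return sum(
        p * cmath.exp(1j * integrate(lambda t: l(t) * path(t), a, b))
        for p, path in outcomes
    )


value = chi(lambda t: 1.0)
print(value, math.cos(0.5))
```

Because the law of $Z$ is symmetric, the imaginary parts cancel and the characteristic functional is real.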

The question arises: how is the constructed measure $\mu$ connected
with the marginal distributions of the process $\xi(t, \omega)$? In other words,
given the marginal distributions, can one construct a measure $\mu$, and conversely,
given the measure $\mu$, can one determine the marginal distributions?
We now show that for stochastically continuous processes the answer is
positive.

Let $\xi(t, \omega)$ be a stochastically continuous measurable process (it follows
from Theorem 1 of Section 3 in Chapter III that for a stochastically
continuous process there always exists an equivalent measurable process).

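Set

$$g_N(x) = \begin{cases} x, & |x| \le N, \\ N \operatorname{sign} x, & |x| > N, \end{cases} \qquad \xi_N(t, \omega) = g_N(\xi(t, \omega)).$$

Then

$$\int_a^b |\xi_N(t, \omega)|^p\,dt \uparrow \int_a^b |\xi(t, \omega)|^p\,dt$$

as $N \to \infty$ for almost all $\omega$. The process $\xi_N(t, \omega)$ is also stochastically
continuous.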

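We show that

$$\int_a^b \xi_N(t, \omega)\,dt = \lim_{\lambda \to 0} \sum_{k=0}^{n-1} \xi_N(t_k, \omega)\,\Delta t_k, \tag{4}$$

where $a = t_0 < t_1 < \dots < t_n = b$, $\Delta t_k = t_{k+1} - t_k$, $\lambda = \max_k \Delta t_k$, and the limit is
taken in the sense of convergence in probability. We have

$$\mathsf{E}\Bigl| \int_a^b \xi_N(t, \omega)\,dt - \sum_{k=0}^{n-1} \xi_N(t_k, \omega)\,\Delta t_k \Bigr| \le \sum_{k=0}^{n-1} \int_{t_k}^{t_{k+1}} \mathsf{E}\,|\xi_N(t, \omega) - \xi_N(t_k, \omega)|\,dt \le$$

$$\le \varepsilon (b - a) + 2N (b - a) \sup[\mathsf{P}\{|\xi_N(t, \omega) - \xi_N(s, \omega)| > \varepsilon\};\ |t - s| \le \lambda]$$

for any $\varepsilon > 0$. Approaching the limit as $\lambda \to 0$ and taking the stochastic
continuity of $\xi_N$ into account (and hence the uniform stochastic
continuity), as well as the fact that $\varepsilon > 0$ is arbitrary, we obtain the validity of
equality (4).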

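Similarly,

$$\int_a^b |\xi(t, \omega)|^p\,dt = \lim_{N \to \infty} \lim_{\lambda \to 0} \sum_{k=0}^{n-1} |\xi_N(t_k, \omega)|^p\,\Delta t_k \tag{5}$$

holds, and hence the fulfilment of condition (1) can be checked by means
of the marginal distributions of the process $\xi(t, \omega)$. If condition (1) is
satisfied, then, using the equality

$$\int_a^b l(t)\,\xi(t, \omega)\,dt = \lim_{N \to \infty} \lim_{\lambda \to 0} \sum_{k=0}^{n-1} l(t_k)\,\xi_N(t_k, \omega)\,\Delta t_k, \tag{6}$$

which is valid for any continuous function $l(t)$ on $[a, b]$, we can define

$$\chi(l) = \lim_{N \to \infty} \lim_{\lambda \to 0} \mathsf{E} \exp\Bigl\{ i \sum_{k=0}^{n-1} l(t_k)\,\xi_N(t_k, \omega)\,\Delta t_k \Bigr\} \tag{7}$$

for all continuous $l(t)$. In view of the continuity of $\chi(l)$, (7) determines the
value of $\chi(l)$ on the closure of the set $L$ of all the functionals

$$l(x) = \int_a^b l(t)\,x(t)\,dt$$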







with continuous $l(t)$. And since $L$ is everywhere dense in the space of all
continuous functionals on $\mathcal{L}_p$, it means that $\chi(l)$ is completely determined
by relation (7).
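The order of the two limits in (5) to (7), first refining the subdivision and then removing the truncation, matters when a path is unbounded. The following sketch uses one hypothetical deterministic path, $\xi(t) = t^{-1/2}$ on $(0, 1]$ with an infinite value at $t = 0$, whose integral over $[0, 1]$ equals 2; for fixed $N$ the truncated integral is $2 - 1/N$.

```python
import math


def g(N, x):
    """Truncation g_N(x): x if |x| <= N, otherwise N * sign(x)."""
    return max(-N, min(N, x))


def xi(t):
    """One hypothetical sample path: integrable on [0, 1] but unbounded at t = 0."""
    return math.inf if t == 0.0 else 1.0 / math.sqrt(t)


def truncated_riemann_sum(N, n, a=0.0, b=1.0, p=1.0):
    """sum_k |g_N(xi(t_k))|^p * dt over the uniform grid t_k = a + k (b - a) / n."""
    dt = (b - a) / n
    return sum(abs(g(N, xi(a + k * dt))) ** p * dt for k in range(n))


# Inner limit (fine subdivision) for two truncation levels; exact values are 2 - 1/N.
inner = {N: truncated_riemann_sum(N, 200_000) for N in (10, 100)}
# Without truncation the very first term of the sum is already infinite.
untruncated = truncated_riemann_sum(math.inf, 10)
print(inner, untruncated)
```

With the truncation in place every Riemann sum is finite and the inner limits $2 - 1/N$ increase to the true integral, whereas the untruncated sum over the same grid points is infinite.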

Assume now that for a measurable stochastically continuous process
$\xi(t, \omega)$ on $[a, b]$ condition (1) is satisfied and the measure $\mu$ on $\mathcal{L}_p$ is
defined, or equivalently the characteristic functional $\chi(l)$ is defined. Let
$l(t)$ be a certain continuous function, $N > 0$, and let $n$ be a positive integer.
Set

$$t_{nk} = a + \frac{k}{n}(b - a),$$

$$l_{n,N}(\xi) = \sum_{k=0}^{n-1} l(t_{nk})\,\frac{b - a}{n}\; g_N\Bigl( \frac{1}{t_{nk+1} - t_{nk}} \int_{t_{nk}}^{t_{nk+1}} \xi(t, \omega)\,dt \Bigr).$$

For almost all $\omega$

$$\lim_{n \to \infty} l_{n,N}(\xi) = \int_a^b l(t)\,\xi_N(t, \omega)\,dt.$$

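Therefore for the characteristic functional $\chi_N(l)$ of the process $\xi_N(t, \omega)$ in
$\mathcal{L}_p$ the following relation is valid:

$$\chi_N(l) = \mathsf{E} \exp\Bigl\{ i \int_a^b l(t)\,\xi_N(t, \omega)\,dt \Bigr\} = \lim_{n \to \infty} \int e^{i\, l_{n,N}(x)}\,\mu(dx),$$

where $l(t)$ is an arbitrary continuous function and

$$l_{n,N}(x) = \sum_{k=0}^{n-1} l(t_{nk})\,\frac{b - a}{n}\; g_N\Bigl( \frac{n}{b - a} \int_{t_{nk}}^{t_{nk+1}} x(t)\,dt \Bigr)$$

is a continuous and therefore $\mathfrak{B}$-measurable functional on $\mathcal{L}_p$. By
continuity the functional $\chi_N(l)$ can be extended to the whole of $L$. We now
show that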



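$$\frac{1}{h} \int_t^{t+h} \xi_N(s, \omega)\,ds \to \xi_N(t, \omega) \tag{8}$$

in probability as $h \downarrow 0$. Indeed,

$$\mathsf{E}\Bigl| \frac{1}{h} \int_t^{t+h} \bigl( \xi_N(s, \omega) - \xi_N(t, \omega) \bigr)\,ds \Bigr| \le \varepsilon + 2N \sup[\mathsf{P}\{|\xi_N(s, \omega) - \xi_N(t, \omega)| > \varepsilon\};\ t < s < t + h].$$

The last expression can be made arbitrarily small by the proper choice
of $\varepsilon > 0$ and $h > 0$.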

Denote by $l_{k,h}$ the functional on $\mathcal{L}_p$ defined by the equality

$$l_{k,h}(x) = \frac{1}{h} \int_{t_k}^{t_k + h} x(s)\,ds.$$

Then it follows from (8) that for all real $u_k$ and all points $t_1 < \dots < t_m$ in
$[a, b)$

$$\mathsf{E} \exp\Bigl\{ i \sum_{k=1}^{m} u_k\, \xi_N(t_k, \omega) \Bigr\} = \lim_{h \downarrow 0} \chi_N\Bigl( \sum_{k=1}^{m} u_k\, l_{k,h} \Bigr). \tag{9}$$

Relation (9) determines the marginal distributions of the process $\xi_N(t, \omega)$.
Approaching the limit as $N \to \infty$ we thus obtain the marginal distributions
of the process $\xi(t, \omega)$. Formulas (5) and (7) are inconvenient
because the truncated process $\xi_N(t, \omega)$ rather than the original process
$\xi(t, \omega)$ appears in these formulas. Moreover, it is desirable to obtain a
condition sufficient for the fulfilment of relation (1) in terms of the
probabilistic characteristics of the random variables rather than in terms
of the limit of the random variables themselves. To obtain simpler statements
regarding the integrability in the $p$-th degree of a process we need the
following

Lemma. Let $\xi(t)$ be a measurable stochastically continuous non-negative
process defined on $[a, b]$. For any sequence of subdivisions of the interval
$[a, b]$, $a = t_{n0} < t_{n1} < \dots < t_{nn} = b$, for which $\lambda_n = \max_k \Delta t_{nk} \to 0$, and any random
variables $\tau_{nk}$ independent of $\xi(t)$ and uniformly distributed on the
intervals $[t_{nk}, t_{nk+1}]$ respectively, $k = 0, 1, \dots, n - 1$, the relation

$$\sum_{k=0}^{n-1} \xi(\tau_{nk})\,\Delta t_{nk} \to \int_a^b \xi(t)\,dt$$

is satisfied in probability as $n \to \infty$ (the integral on the right may take the
value $+\infty$).
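The randomized Riemann sums of the lemma are easy to simulate. The sketch below fixes a single non-negative path, $\xi(t) = t^2$ on $[0, 1]$ (a hypothetical choice made so that the limit $1/3$ is known), and draws one uniform point per subinterval from a seeded generator.

```python
import random


def randomized_riemann_sum(xi, a, b, n, rng):
    """sum_k xi(tau_k) * dt with tau_k uniform on [t_k, t_{k+1}], drawn independently of xi."""
    dt = (b - a) / n
    return sum(xi(a + (k + rng.random()) * dt) * dt for k in range(n))


rng = random.Random(0)      # fixed seed so the run is reproducible
path = lambda t: t * t      # a single non-negative path; its integral over [0, 1] is 1/3
sums = [randomized_riemann_sum(path, 0.0, 1.0, n, rng) for n in (10, 100, 10_000)]
print(sums)
```

For a smooth path the fluctuation of the sum is of much smaller order than the mesh, so the values settle near $1/3$ quickly; the content of the lemma is that convergence in probability persists for merely stochastically continuous, possibly non-integrable, processes.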

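Proof. Since the process $\xi(t)$ can be considered separately on each one
of the sets

$$\Bigl\{\omega: \int_a^b \xi(t)\,dt < \infty\Bigr\} \quad \text{and} \quad \Bigl\{\omega: \int_a^b \xi(t)\,dt = +\infty\Bigr\},$$

it is sufficient to consider the two cases when only one of these conditions
is satisfied with probability 1. First let

$$\mathsf{P}\Bigl\{ \int_a^b \xi(t)\,dt = +\infty \Bigr\} = 1.$$

As was shown above,

$$\sum_{k=0}^{n-1} \xi_N(\tau_{nk})\,\Delta t_{nk} \to \int_a^b \xi_N(t)\,dt$$

in probability, and hence for each $c > 0$ the relation

$$\lim_{n \to \infty} \mathsf{P}\Bigl\{ \sum_{k=0}^{n-1} \xi(\tau_{nk})\,\Delta t_{nk} \ge c \Bigr\} \ge \lim_{n \to \infty} \mathsf{P}\Bigl\{ \sum_{k=0}^{n-1} \xi_N(\tau_{nk})\,\Delta t_{nk} \ge c \Bigr\} \ge \mathsf{P}\Bigl\{ \int_a^b \xi_N(t)\,dt > c \Bigr\}$$

is satisfied.

Approaching the limit as $N \to \infty$ we verify that $\sum_{k=0}^{n-1} \xi(\tau_{nk})\,\Delta t_{nk}$
converges in probability to $+\infty$.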

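Now let

$$\mathsf{P}\Bigl\{ \int_a^b \xi(t)\,dt < \infty \Bigr\} = 1.$$

Set

$$\xi^m(t) = \xi(t), \quad a \le t \le b, \quad \text{if } \int_a^b \xi(t)\,dt \le m,$$

$$\xi^m(t) = 0, \quad a \le t \le b, \quad \text{if } \int_a^b \xi(t)\,dt > m.$$

The process $\xi^m(t)$ is non-negative, stochastically continuous and measurable.

Since

$$\int_a^b \xi^m(t)\,dt \le m,$$

then in view of Fubini's theorem the expectation $\mathsf{E}\,\xi^m(t)$ exists for almost
all $t$ and $\int_a^b \mathsf{E}\,\xi^m(t)\,dt \le m$. Set $\xi_N^m(t) = g_N(\xi^m(t))$. The process $\xi_N^m(t)$ is
stochastically continuous and bounded by $N$. Hence for every $\varepsilon > 0$ an
$h > 0$ can be found such that

$$\mathsf{E}\,|\xi_N^m(t) - \xi_N^m(s)| < \varepsilon$$

provided $|t - s| \le h$.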

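Utilizing this fact, we obtain

$$\lim_{n \to \infty} \mathsf{E}\Bigl| \int_a^b \xi^m(t)\,dt - \sum_{k=0}^{n-1} \xi^m(\tau_{nk})\,\Delta t_{nk} \Bigr| \le \lim_{n \to \infty} \sum_{k=0}^{n-1} \frac{1}{\Delta t_{nk}} \int_{t_{nk}}^{t_{nk+1}} \int_{t_{nk}}^{t_{nk+1}} \mathsf{E}\,|\xi^m(t) - \xi^m(s)|\,dt\,ds \le$$

$$\le \lim_{n \to \infty} \sum_{k=0}^{n-1} \frac{1}{\Delta t_{nk}} \int_{t_{nk}}^{t_{nk+1}} \int_{t_{nk}}^{t_{nk+1}} \bigl[ \mathsf{E}\,|\xi_N^m(t) - \xi_N^m(s)| + 2\,\mathsf{E}\,|\xi_N^m(t) - \xi^m(t)| \bigr]\,dt\,ds \le$$

$$\le \lim_{n \to \infty} \sup_{|t - s| \le \lambda_n} \mathsf{E}\,|\xi_N^m(t) - \xi_N^m(s)|\,(b - a) + 2 \int_a^b \mathsf{E}\,|\xi_N^m(t) - \xi^m(t)|\,dt = 2 \int_a^b \mathsf{E}\,|\xi_N^m(t) - \xi^m(t)|\,dt.$$

The last expression tends to 0 as $N \to \infty$. Finally,

$$\mathsf{P}\Bigl\{ \Bigl| \int_a^b \xi(t)\,dt - \sum_{k=0}^{n-1} \xi(\tau_{nk})\,\Delta t_{nk} \Bigr| > \varepsilon \Bigr\} \le \mathsf{P}\Bigl\{ \int_a^b \xi(t)\,dt > m \Bigr\} + \mathsf{P}\Bigl\{ \Bigl| \int_a^b \xi^m(t)\,dt - \sum_{k=0}^{n-1} \xi^m(\tau_{nk})\,\Delta t_{nk} \Bigr| > \varepsilon \Bigr\}.$$

Approaching the limit first as $n \to \infty$ and then as $m \to \infty$ we obtain the
proof of the lemma. □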

Corollary. Under the conditions of the lemma, non-random points
$s_{nk} \in [t_{nk}, t_{nk+1}]$ exist such that

$$\sum_{k=0}^{n-1} \xi(s_{nk})\,\Delta t_{nk} \to \int_a^b \xi(t)\,dt$$

in probability as $n \to \infty$.



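Remark. If for some sequence of subdivisions of the interval $[a, b]$ of
the form given in the lemma and for some choice of points $s_{nk} \in [t_{nk}, t_{nk+1}]$
independent of $\xi(t)$ the quantity $\sum_{k=0}^{n-1} \xi(s_{nk})\,\Delta t_{nk}$ is bounded in
probability, then

$$\mathsf{P}\Bigl\{ \int_a^b \xi(t)\,dt < \infty \Bigr\} = 1.$$

Indeed, for any $\varepsilon > 0$ and $c > 0$

$$\mathsf{P}\Bigl\{ \int_a^b \xi_N(t)\,dt > c \Bigr\} \le \lim_{n \to \infty} \mathsf{P}\Bigl\{ \sum_{k=0}^{n-1} \xi_N(s_{nk})\,\Delta t_{nk} > c - \varepsilon \Bigr\} \le \lim_{n \to \infty} \mathsf{P}\Bigl\{ \sum_{k=0}^{n-1} \xi(s_{nk})\,\Delta t_{nk} > c - \varepsilon \Bigr\}.$$

Hence also

$$\mathsf{P}\Bigl\{ \int_a^b \xi(t)\,dt > c \Bigr\} \le \lim_{n \to \infty} \mathsf{P}\Bigl\{ \sum_{k=0}^{n-1} \xi(s_{nk})\,\Delta t_{nk} > c - \varepsilon \Bigr\}. \qquad □$$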

n-*^oo ffc = 0 



We now present a condition necessary and sufficient for finiteness of
the integral $\int_a^b |\xi(t)|^\alpha\,dt$ with $\alpha \in (0, 2]$, expressed in terms of the
characteristic functional of the process. Since we do not know as yet in which
space the process can be considered, we utilize the characteristic
functional determined by the relation

$$\chi_0(g) = \mathsf{E} \exp\Bigl\{ i \int_a^b \xi(t)\,dg(t) \Bigr\}$$

for any step function $g(t)$ defined on $[a, b]$. Clearly, defining $\chi_0(g)$ is
equivalent to defining the marginal distributions of the process.

We construct a random function $\nu_n^\alpha(t)$ defined on $[a, b]$ as follows. Let
$t_{nk} = a + \frac{k}{n}(b - a)$, $k = 0, \dots, n$, and let $\eta_0, \dots, \eta_{n-1}$ be random variables
independent of $\xi(t)$, each one of which is uniformly distributed on $[0, 1]$
(otherwise, the joint distribution of $\eta_k$ can be arbitrary). Finally let the
variables $\zeta_0, \dots, \zeta_{n-1}$, which depend neither on $\xi(t)$ nor on $\eta_k$, be
independent and identically distributed, and moreover $\mathsf{E}\, e^{i s \zeta_k} = e^{-|s|^\alpha}$, i.e.
$\zeta_k$ has a symmetric stable distribution with index $\alpha$. Set $\nu_n^\alpha(a) = 0$, let $\nu_n^\alpha(t)$ be
constant for $(t - a)\,n \in \bigl((j + \eta_j)(b - a),\ (j + 1 + \eta_{j+1})(b - a)\bigr)$, and let

$$\nu_n^\alpha\Bigl( a + \frac{j + \eta_j}{n}(b - a) + 0 \Bigr) - \nu_n^\alpha\Bigl( a + \frac{j + \eta_j}{n}(b - a) - 0 \Bigr) = \frac{\zeta_j}{n^{1/\alpha}}.$$

These conditions uniquely determine $\nu_n^\alpha(t)$ (except at the discontinuity
points). Moreover, $\nu_n^\alpha(t)$ is a step function with probability 1, and hence the
expression $\chi_0(\lambda \nu_n^\alpha)$ is determined with probability 1.

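Theorem. In order that the integral $\int_a^b |\xi(t)|^\alpha\,dt$ (for some $\alpha \in (0, 2]$) be
finite with probability 1, where $\xi(t)$ is a stochastically continuous measurable
process, it is necessary and sufficient that for all $\lambda > 0$ the limit

$$\psi(\lambda) = \lim_{n \to \infty} \mathsf{E}\, \chi_0(\lambda \nu_n^\alpha)$$

exists and satisfies the condition $\psi(0+) = 1$. Moreover,

$$\psi(\lambda) = \mathsf{E} \exp\Bigl\{ -\frac{\lambda^\alpha}{b - a} \int_a^b |\xi(t)|^\alpha\,dt \Bigr\}.$$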
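The role of the condition $\psi(0+) = 1$ can be seen from the closed form of $\psi$: with the convention $e^{-\infty} = 0$, the limit at zero equals the probability that the integral is finite. The sketch below assumes a hypothetical two-point law for the integral (finite with probability 0.7, infinite otherwise) and is not a simulation of the stable construction itself.

```python
import math


def psi(lam, alpha, length, outcomes):
    """psi(lambda) = E exp(-lambda^alpha / (b - a) * I), with exp(-inf) = 0.

    outcomes is a list of (probability, value of the integral I) pairs.
    """
    total = 0.0
    for prob, integral in outcomes:
        if not math.isinf(integral):
            total += prob * math.exp(-(lam ** alpha) * integral / length)
    return total


# Hypothetical law of the integral: equal to 2.0 with probability 0.7, infinite otherwise.
outcomes = [(0.7, 2.0), (0.3, math.inf)]
values = [psi(lam, alpha=1.5, length=1.0, outcomes=outcomes) for lam in (1.0, 0.1, 0.001)]
print(values)
```

As $\lambda$ decreases the values increase towards 0.7 rather than 1, which signals that the integral is infinite on a set of probability 0.3.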

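Proof. Denote by $\mathfrak{A}$ the $\sigma$-algebra generated by the variables $\xi(t)$,
$t \in [a, b]$, and $\eta_k$, $k = 0, \dots, n - 1$. Under the conditions of the theorem, $\zeta_k$ are
independent of this $\sigma$-algebra. Therefore

$$\mathsf{E}\Bigl( \exp\Bigl\{ i \lambda \int_a^b \xi(t)\,d\nu_n^\alpha(t) \Bigr\} \Bigm| \mathfrak{A} \Bigr) = \exp\Bigl\{ -\frac{\lambda^\alpha}{n} \sum_{k=0}^{n-1} \Bigl| \xi\Bigl( a + \frac{k + \eta_k}{n}(b - a) \Bigr) \Bigr|^\alpha \Bigr\}.$$

It follows from the lemma that

$$\frac{1}{n} \sum_{k=0}^{n-1} \Bigl| \xi\Bigl( a + \frac{k + \eta_k}{n}(b - a) \Bigr) \Bigr|^\alpha \to \frac{1}{b - a} \int_a^b |\xi(t)|^\alpha\,dt$$

in probability. Therefore

$$\exp\Bigl\{ -\frac{\lambda^\alpha}{n} \sum_{k=0}^{n-1} \Bigl| \xi\Bigl( a + \frac{k + \eta_k}{n}(b - a) \Bigr) \Bigr|^\alpha \Bigr\} \to \exp\Bigl\{ -\frac{\lambda^\alpha}{b - a} \int_a^b |\xi(t)|^\alpha\,dt \Bigr\}$$

also in probability (we define $e^{-\infty} = 0$). Since the quantities under
consideration do not exceed 1, it follows that

$$\lim_{n \to \infty} \mathsf{E} \exp\Bigl\{ -\frac{\lambda^\alpha}{n} \sum_{k=0}^{n-1} \Bigl| \xi\Bigl( a + \frac{k + \eta_k}{n}(b - a) \Bigr) \Bigr|^\alpha \Bigr\} = \mathsf{E} \exp\Bigl\{ -\frac{\lambda^\alpha}{b - a} \int_a^b |\xi(t)|^\alpha\,dt \Bigr\}$$

also.

Moreover, since

$$\lim_{n \to \infty} \mathsf{E} \exp\Bigl\{ -\frac{\lambda^\alpha}{n} \sum_{k=0}^{n-1} \Bigl| \xi\Bigl( a + \frac{k + \eta_k}{n}(b - a) \Bigr) \Bigr|^\alpha \Bigr\} = \lim_{n \to \infty} \mathsf{E}\, \mathsf{E}\Bigl( \exp\Bigl\{ i \lambda \int_a^b \xi(t)\,d\nu_n^\alpha(t) \Bigr\} \Bigm| \mathfrak{A} \Bigr) = \lim_{n \to \infty} \mathsf{E}\, \chi_0(\lambda \nu_n^\alpha),$$

we have

$$\psi(\lambda) = \mathsf{E} \exp\Bigl\{ -\frac{\lambda^\alpha}{b - a} \int_a^b |\xi(t)|^\alpha\,dt \Bigr\}.$$

Clearly,

$$\psi(0+) = \mathsf{P}\Bigl\{ \int_a^b |\xi(t)|^\alpha\,dt < \infty \Bigr\}.$$

The proof of the theorem follows from the last relation. □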
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§5. Measures in Hilbert Spaces 

The space $\mathcal{L}_2$ is the most interesting among the spaces discussed in the previous sections. It is a separable Hilbert space. Since all separable Hilbert spaces are isometric, it is more convenient to consider the abstract separable Hilbert space $\mathcal{X}$. The results obtained for such a space can easily be restated for various specific Hilbert spaces, for example, for the space of functions that are measurable on an arbitrary measurable space with a measure, take values in a separable Banach space and are square integrable in the norm.

Denote by $\mathfrak{B}$ the σ-algebra of Borel sets of $\mathcal{X}$. The pair $(\mathcal{X},\mathfrak{B})$ is called a measurable Hilbert space. Measures $\mu$ defined on a measurable Hilbert space $(\mathcal{X},\mathfrak{B})$ are the main subject of study in this section. As before we are interested in probability measures, but since the results of this section are applicable to any finite measures, the condition $\mu(\mathcal{X})=1$ is not imposed.

The scalar product in $\mathcal{X}$ will be denoted by $(x,y)$ and we denote the norm of $x$ by $|x|=\sqrt{(x,x)}$. A measure on $(\mathcal{X},\mathfrak{B})$, as in any linear space, can be defined by means of a characteristic functional. Every continuous linear functional $l(x)$ defined on $\mathcal{X}$ is of the form $l(x)=(x,z)$, where $z$ is some element of $\mathcal{X}$. The function $\varphi(z)$ defined by the equality

$$\varphi(z)=\int e^{i(z,x)}\,\mu(dx) \tag{1}$$

for all $z\in\mathcal{X}$ is called the characteristic functional of the measure $\mu$ on $(\mathcal{X},\mathfrak{B})$.

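Let $\mathcal{L}$ be an arbitrary finite-dimensional subspace of the space $\mathcal{X}$ and $\mathfrak{B}_{\mathcal{L}}$ be the σ-algebra of Borel subsets of $\mathcal{L}$. A set of the form

$$\{x: P_{\mathcal{L}}x\in A_{\mathcal{L}}\},$$

where $A_{\mathcal{L}}\in\mathfrak{B}_{\mathcal{L}}$ and $P_{\mathcal{L}}$ is the projector onto $\mathcal{L}$, is called a cylindrical set with base in $\mathcal{L}$. The totality $\mathfrak{B}^{\mathcal{L}}$ of all cylindrical sets with bases in $\mathcal{L}$ is also a σ-algebra. The sets belonging to $\mathfrak{B}^{\mathcal{L}}$ for some $\mathcal{L}$ are called cylindrical sets, and functions measurable relative to $\mathfrak{B}^{\mathcal{L}}$ for some $\mathcal{L}$ are called cylindrical functions.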

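Given a measure $\mu$ on $(\mathcal{X},\mathfrak{B})$ one can associate with it the set of its finite-dimensional projections (finite-dimensional distributions) $\mu_{\mathcal{L}}$ defined by the equality

$$\mu_{\mathcal{L}}(A)=\mu(\{x: P_{\mathcal{L}}x\in A\}),\qquad A\in\mathfrak{B}_{\mathcal{L}}.$$

The measures $\mu_{\mathcal{L}}$ are sufficient for evaluating integrals of cylindrical functions: for each $\mathfrak{B}_{\mathcal{L}}$-measurable bounded function $h(x)$

$$\int_{\mathcal{L}}h(x)\,\mu_{\mathcal{L}}(dx)=\int_{\mathcal{X}}h(P_{\mathcal{L}}x)\,\mu(dx). \tag{2}$$

We note that any cylindrical function is of the form $h(P_{\mathcal{L}}x)$ for some $\mathcal{L}$, where $h$ is $\mathfrak{B}_{\mathcal{L}}$-measurable. The measures $\mu_{\mathcal{L}}$ for various $\mathcal{L}$ are coordinated in the following manner: if $\mathcal{L}\subset\mathcal{L}'$, then

$$\mu_{\mathcal{L}}(A)=\mu_{\mathcal{L}'}(\{x\in\mathcal{L}': P_{\mathcal{L}}x\in A\}),\qquad A\in\mathfrak{B}_{\mathcal{L}}. \tag{3}$$

This relation follows from (2) and the possibility of representing the function $h(P_{\mathcal{L}}x)$ in the form $h'(P_{\mathcal{L}'}x)$, where $h'$ is $\mathfrak{B}_{\mathcal{L}'}$-measurable. Condition (3) will be referred to in what follows as the condition of compatibility, and a family of measures $\{\mu_{\mathcal{L}}\}$ defined on all the finite-dimensional subspaces $\mathcal{L}$ and satisfying condition (3) is called a compatible family of finite-dimensional distributions.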

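To define the measure $\mu$ it is sufficient to know only its one-dimensional projections. This follows from the formula

$$\varphi(z)=\int_{\mathcal{L}_z}e^{i(z,x)}\,\mu_{\mathcal{L}_z}(dx),$$

where $\mathcal{L}_z$ is the one-dimensional subspace generated by the vector $z$. Conversely, given the characteristic functional $\varphi(z)$ one can easily determine all the measures $\mu_{\mathcal{L}}$ by means of their characteristic functionals as follows: for $z\in\mathcal{L}$

$$\varphi(z)=\int_{\mathcal{X}}e^{i(z,x)}\,\mu(dx)=\int_{\mathcal{X}}e^{i(z,P_{\mathcal{L}}x)}\,\mu(dx)=\int_{\mathcal{L}}e^{i(z,x)}\,\mu_{\mathcal{L}}(dx).$$

Moreover, it turns out that the existence of a function $\varphi(z)$ satisfying the equality

$$\varphi(z)=\int_{\mathcal{L}}e^{i(z,x)}\,\mu_{\mathcal{L}}(dx),\qquad z\in\mathcal{L},$$

for any finite-dimensional subspace $\mathcal{L}$ is a necessary and sufficient condition for compatibility of the family of finite-dimensional distributions $\{\mu_{\mathcal{L}}\}$. We prove this assertion. Let the family $\{\mu_{\mathcal{L}}\}$ satisfy (3).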

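Set

$$\varphi_{\mathcal{L}}(z)=\int_{\mathcal{L}}e^{i(z,x)}\,\mu_{\mathcal{L}}(dx),\qquad z\in\mathcal{L}.$$

If $\mathcal{L}\subset\mathcal{L}'$ and $z\in\mathcal{L}$, then in view of (3)

$$\varphi_{\mathcal{L}'}(z)=\int_{\mathcal{L}'}e^{i(z,x)}\,\mu_{\mathcal{L}'}(dx)=\int_{\mathcal{L}'}e^{i(z,P_{\mathcal{L}}x)}\,\mu_{\mathcal{L}'}(dx)=\int_{\mathcal{L}}e^{i(z,x)}\,\mu_{\mathcal{L}}(dx)=\varphi_{\mathcal{L}}(z).$$

Set $\varphi(z)=\varphi_{\mathcal{L}_z}(z)$, where $\mathcal{L}_z$ is the one-dimensional subspace generated by the vector $z$. Since $\mathcal{L}_z\subset\mathcal{L}$ for $z\in\mathcal{L}$, we have $\varphi_{\mathcal{L}}(z)=\varphi_{\mathcal{L}_z}(z)=\varphi(z)$.

Conversely, if $\varphi(z)$ is a function such that $\varphi(z)=\varphi_{\mathcal{L}}(z)$ for $z\in\mathcal{L}$, then for $\mathcal{L}\subset\mathcal{L}'$ and $z\in\mathcal{L}$ we have the relation

$$\int_{\mathcal{L}}e^{i(z,x)}\,\mu_{\mathcal{L}}(dx)=\int_{\mathcal{L}'}e^{i(z,x)}\,\mu_{\mathcal{L}'}(dx)=\int_{\mathcal{L}'}e^{i(z,P_{\mathcal{L}}x)}\,\mu_{\mathcal{L}'}(dx)=\int_{\mathcal{L}}e^{i(z,x)}\,\mu_{\mathcal{L}'}(P_{\mathcal{L}}^{-1}dx),$$

which implies that the measures $\mu_{\mathcal{L}}(dx)$ and $\mu_{\mathcal{L}'}(P_{\mathcal{L}}^{-1}dx)$ coincide (since their characteristic functionals coincide). Thus condition (3) is fulfilled.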

Moment forms. Important characteristics of a measure on $(\mathcal{X},\mathfrak{B})$ are the moment forms of this measure. A moment form of order $k$ of the measure $\mu$ is determined by the relation

$$m_k(z_1,\dots,z_k)=\int(x,z_1)\cdots(x,z_k)\,\mu(dx)$$

under the condition that the integral on the right is well defined (and finite) for all choices of $z_1,\dots,z_k\in\mathcal{X}$. Clearly for the existence of a moment form of order $k$ it is necessary and sufficient that for all $z$ the relation

$$\int|(x,z)|^k\,\mu(dx)<\infty \tag{4}$$

be satisfied.

The function $m_k(z_1,\dots,z_k)$ is symmetric in its arguments and moreover is additive and homogeneous in each one of them. We show that the moment form of order $k$ (under the condition that it is well defined) is a continuous symmetric $k$-linear form. For this purpose it is sufficient to show that

$$\sup_{|z|\le1}\int|(z,x)|^k\,\mu(dx)<\infty \tag{5}$$

is valid.

We introduce the functions

$$m_n(z)=\int\frac{n|(x,z)|^k}{n+|(x,z)|^k}\,\mu(dx),\qquad m(z)=\int|(x,z)|^k\,\mu(dx).$$

The functions $m_n(z)$ are weakly continuous in $z$ and $m_n(z)\uparrow m(z)$ for all $z$ as $n\to\infty$. We set

$$K_{n,l}=\{z: m_n(z)>l\}\cap\{z:|z|\le1\}.$$

The set $K_{n,l}$ is weakly closed and weakly compact (since it is bounded). To prove (5) it is sufficient to show that $K_l=\bigcup_n K_{n,l}$ is void for some $l$ (then $\sup_{|z|\le1}m(z)\le l$). The sets $K_l$ are also weakly closed and weakly compact. If all $K_l$ are non-empty, then the intersection $\bigcap_l K_l$ will also be non-empty. But $m_n(z)\to\infty$ for $z\in\bigcap_l K_l$, which is impossible. Our assertion is thus proved. □

The first two moment forms are the most often used. The form $m_1(z)$ is a continuous linear functional with respect to $z$ provided it is well defined. Hence there exists a vector $a\in\mathcal{X}$ such that

$$\int(x,z)\,\mu(dx)=m_1(z)=(a,z).$$

This vector is called the mean value of the measure $\mu$. If the form $m_2(z_1,z_2)$ is defined (in this case the form $m_1(z)$ will also be defined), then the expression

$$m_2(z_1,z_2)-m_1(z_1)m_1(z_2)$$

will be a continuous symmetric bilinear functional. Consequently a symmetric bounded linear operator $B$ exists such that

$$m_2(z_1,z_2)-m_1(z_1)m_1(z_2)=(Bz_1,z_2).$$

This operator is called the correlation operator of the measure $\mu$. It follows from the relation

$$0\le\int(x-a,z)^2\,\mu(dx)=\int(x,z)^2\,\mu(dx)-(a,z)^2=m_2(z,z)-(m_1(z))^2=(Bz,z)$$

that $B$ is a non-negative operator.

We note one important property of the correlation operator. Recall that a symmetric non-negative operator $B$ is called a kernel (or nuclear) operator if it is completely continuous and the series $\sum_k\lambda_k$ of its eigenvalues converges (each value appears in the sum as many times as its multiplicity). A symmetric non-negative operator $B$ is a kernel operator if in some orthonormal basis $\{e_k\}$ of the space $\mathcal{X}$ the series $\sum_k(Be_k,e_k)$ is convergent. In this case this series will be convergent for any choice of the basis and its sum will be independent of the choice of the basis. This sum is called the trace (spur) of the operator and is denoted by $\operatorname{Sp}B$.

Lemma. The correlation operator $B$ of a measure $\mu$ is a kernel operator if and only if the following condition is satisfied:

$$\int|x|^2\,\mu(dx)<\infty.$$

Moreover

$$\operatorname{Sp}B=\int|x|^2\,\mu(dx)-|a|^2,$$

where $a$ is the mean value of $\mu$.

The proof follows from the equality

$$\sum_{k=1}^n(Be_k,e_k)=\int\sum_{k=1}^n(x,e_k)^2\,\mu(dx)-\sum_{k=1}^n(a,e_k)^2,$$

which is valid for any choice of $e_1,\dots,e_n$. Taking the vectors $e_k$ from an orthonormal basis and passing to the limit as $n\to\infty$ (passing to the limit under the integral sign is justified in view of the monotonicity of the sequence $\sum_{k=1}^n(x,e_k)^2$ in $n$) we obtain the assertion of the lemma. □
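The trace identity of the lemma can be checked exactly in a finite-dimensional setting. The sketch below is only an illustration under stated assumptions: the measure is a hypothetical discrete probability measure on $\mathbb{R}^3$ (the atoms, the weights and all function names are invented for the example), and the correlation operator is represented by its matrix in the standard basis.

```python
from fractions import Fraction

# Hypothetical discrete probability measure on R^3 (illustrative data, not from the text).
ATOMS = [(1, 0, 2), (0, 3, -1), (-2, 1, 1), (4, -1, 0)]
WEIGHTS = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
DIM = 3


def mean_vector(atoms, weights):
    """Mean value a, defined by (a, z) = integral of (x, z) mu(dx)."""
    return [sum(w * x[i] for x, w in zip(atoms, weights)) for i in range(DIM)]


def correlation_matrix(atoms, weights):
    """Matrix of B, defined by (B z1, z2) = m2(z1, z2) - m1(z1) m1(z2)."""
    a = mean_vector(atoms, weights)
    return [
        [sum(w * x[i] * x[j] for x, w in zip(atoms, weights)) - a[i] * a[j]
         for j in range(DIM)]
        for i in range(DIM)
    ]


a = mean_vector(ATOMS, WEIGHTS)
B = correlation_matrix(ATOMS, WEIGHTS)

trace_B = sum(B[i][i] for i in range(DIM))                      # Sp B
second_moment = sum(w * sum(c * c for c in x) for x, w in zip(ATOMS, WEIGHTS))
norm_a_squared = sum(c * c for c in a)                          # |a|^2

# Both sides of Sp B = integral |x|^2 mu(dx) - |a|^2, computed in exact arithmetic.
print(trace_B, second_moment - norm_a_squared)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding.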

The Minlos-Sazonov theorem. As was mentioned above, a measure $\mu$ on a measurable Hilbert space $(\mathcal{X},\mathfrak{B})$ can be defined by either its finite-dimensional projections or its characteristic functional. As it turns out, the two methods do not differ significantly.

Let a compatible family of finite-dimensional distributions $\{\mu_{\mathcal{L}}\}$ be given. Under what conditions does a measure $\mu$ exist on $(\mathcal{X},\mathfrak{B})$ such that $\{\mu_{\mathcal{L}}\}$ are its projections? Since the $\mu_{\mathcal{L}}$ enable us to construct a functional $\varphi(z)$ which coincides for $z\in\mathcal{L}$ with the characteristic functional of the measure $\mu_{\mathcal{L}}$, the problem posed reduces to the following: under what conditions is $\varphi(z)$ a characteristic functional of a certain measure $\mu$ on $(\mathcal{X},\mathfrak{B})$? The answer to this last question is given by the Minlos-Sazonov theorem.

Theorem 1. In order that a complex-valued continuous positive definite function $\varphi(z)$ defined for $z\in\mathcal{X}$ be a characteristic functional of a certain measure $\mu$ on $(\mathcal{X},\mathfrak{B})$ it is necessary and sufficient that for any $\varepsilon>0$ a kernel operator $A_{\varepsilon}$ be found such that $\operatorname{Re}(\varphi(0)-\varphi(z))<\varepsilon$ as long as $(A_{\varepsilon}z,z)\le1$.

Proof. Necessity. Let $\varphi(z)$ be the characteristic functional of the measure $\mu$. Then

$$\operatorname{Re}(\varphi(0)-\varphi(z))=\int(1-\cos(z,x))\,\mu(dx)\le\int_{|x|\le c}(1-\cos(z,x))\,\mu(dx)+2\int_{|x|>c}\mu(dx)\le\frac12\int_{|x|\le c}(x,z)^2\,\mu(dx)+2\mu(\{x:|x|>c\}).$$

The expression $\int_{|x|\le c}(x,z)^2\,\mu(dx)$ is for each $c$ a quadratic functional relative to $z$ and admits representation in the form $(B_cz,z)$, where $B_c$ is a kernel operator (in view of the lemma above), since

$$\int_{|x|\le c}|x|^2\,\mu(dx)\le\mu(\mathcal{X})\,c^2.$$

We choose $c$ in such a manner that the inequality $\mu(\{x:|x|>c\})<\varepsilon/4$ is satisfied. Then taking $A_{\varepsilon}=\frac{1}{\varepsilon}B_c$, we obtain for $(A_{\varepsilon}z,z)\le1$

$$\operatorname{Re}(\varphi(0)-\varphi(z))<\frac{\varepsilon}{2}+\frac12(B_cz,z)\le\frac{\varepsilon}{2}+\frac{\varepsilon}{2}=\varepsilon.$$

The necessity of the condition of the theorem is thus proved.

Sufficiency. Let $\{\mu_{\mathcal{L}}\}$ be the compatible family of finite-dimensional distributions constructed using $\varphi(z)$. It follows from the theorem in Section 3 and the remark following that theorem that it is sufficient to prove that for any $\varepsilon>0$ there exists $N$ such that for all finite-dimensional subspaces $\mathcal{L}$ the inequality

$$\mu_{\mathcal{L}}(\{x:|x|>N\})<\varepsilon \tag{6}$$

is fulfilled. Indeed, in place of the functions $h_n(x)$ appearing in the remark in Section 3 we can use the functions $|P_{\mathcal{L}_n}x|$, where $\mathcal{L}_n$ is an increasing sequence of finite-dimensional subspaces for which $\bigcup_n\mathcal{L}_n$ is dense in $\mathcal{X}$ and $P_{\mathcal{L}_n}$ is the projector onto $\mathcal{L}_n$.

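To prove formula (6) we utilize Chebyshev's inequality, which yields

$$\mu_{\mathcal{L}}(\{x:|x|>N\})\le\bigl(1-e^{-\lambda N^2/2}\bigr)^{-1}\int_{\mathcal{L}}\bigl(1-e^{-\lambda|x|^2/2}\bigr)\,\mu_{\mathcal{L}}(dx)=$$

$$=\bigl(1-e^{-\lambda N^2/2}\bigr)^{-1}\int_{\mathcal{L}}\Bigl[(2\pi\lambda)^{-s/2}\int_{\mathcal{L}}\bigl(1-e^{i(x,z)}\bigr)e^{-(1/2\lambda)|z|^2}\,m_{\mathcal{L}}(dz)\Bigr]\mu_{\mathcal{L}}(dx),$$

where $m_{\mathcal{L}}(dz)$ is the Lebesgue measure on $\mathcal{L}$ and $s$ is the dimension of $\mathcal{L}$.

Interchanging the order of integration we obtain

$$\mu_{\mathcal{L}}(\{x:|x|>N\})\le\bigl(1-e^{-\lambda N^2/2}\bigr)^{-1}(2\pi\lambda)^{-s/2}\int_{\mathcal{L}}(\varphi(0)-\varphi(z))\,e^{-(1/2\lambda)|z|^2}\,m_{\mathcal{L}}(dz).$$

Next we choose a kernel operator $A$ such that $\operatorname{Re}(\varphi(0)-\varphi(z))<\frac{\varepsilon}{2}$ for $(Az,z)<1$. Then

$$(2\pi\lambda)^{-s/2}\int_{\mathcal{L}}(\varphi(0)-\varphi(z))\,e^{-(1/2\lambda)|z|^2}\,m_{\mathcal{L}}(dz)\le\frac{\varepsilon}{2}+2(2\pi\lambda)^{-s/2}\int_{(Az,z)\ge1}e^{-(1/2\lambda)|z|^2}\,m_{\mathcal{L}}(dz)\le$$

$$\le\frac{\varepsilon}{2}+2(2\pi\lambda)^{-s/2}\int_{\mathcal{L}}(Az,z)\,e^{-(1/2\lambda)|z|^2}\,m_{\mathcal{L}}(dz)\le\frac{\varepsilon}{2}+2\lambda\operatorname{Sp}A,$$

since, for an orthonormal basis $e_1,\dots,e_s$ of $\mathcal{L}$,

$$(2\pi\lambda)^{-s/2}\int_{\mathcal{L}}(Az,z)\,e^{-(1/2\lambda)|z|^2}\,m_{\mathcal{L}}(dz)=\sum_{i,j=1}^{s}(2\pi\lambda)^{-s/2}\int_{\mathcal{L}}(Ae_i,e_j)(z,e_i)(z,e_j)\,e^{-(1/2\lambda)|z|^2}\,m_{\mathcal{L}}(dz)=$$

$$=\sum_{i=1}^{s}(Ae_i,e_i)\frac{1}{\sqrt{2\pi\lambda}}\int_{-\infty}^{\infty}t^2e^{-t^2/2\lambda}\,dt=\lambda\sum_{i=1}^{s}(Ae_i,e_i)\le\lambda\operatorname{Sp}A.$$

Thus

$$\mu_{\mathcal{L}}(\{x:|x|>N\})\le\Bigl(\frac{\varepsilon}{2}+2\lambda\operatorname{Sp}A\Bigr)\bigl(1-e^{-\lambda N^2/2}\bigr)^{-1}. \tag{7}$$

Clearly, one can choose $\lambda$ and $N$ in such a manner that the right-hand side of (7) will be less than $\varepsilon$. The theorem is thus proved. □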
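The two computational ingredients of the sufficiency argument can be illustrated numerically. The following is a minimal sketch under assumptions that are not in the text: the kernel operator is replaced by a hypothetical $2\times2$ matrix `A` (so $s=2$), the Gaussian integral is approximated by a plain Riemann sum, and the particular choice $\lambda=\varepsilon/(16\operatorname{Sp}A)$, $e^{-\lambda N^2/2}=1/4$ is just one admissible choice, not the one used by the authors.

```python
import math

# Hypothetical 2x2 symmetric non-negative matrix standing in for the kernel
# operator A restricted to a two-dimensional subspace (s = 2).
A = [[2.0, 1.0], [1.0, 3.0]]
TRACE_A = A[0][0] + A[1][1]


def gaussian_quadratic_integral(lam, half_width=8.0, step=0.05):
    """Riemann sum for (2*pi*lam)^(-1) * integral of (Az, z) exp(-|z|^2 / (2*lam)) dz."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for i in range(n + 1):
        z1 = -half_width + i * step
        for j in range(n + 1):
            z2 = -half_width + j * step
            quad = A[0][0] * z1 * z1 + 2 * A[0][1] * z1 * z2 + A[1][1] * z2 * z2
            total += quad * math.exp(-(z1 * z1 + z2 * z2) / (2 * lam))
    return total * step * step / (2 * math.pi * lam)


def tail_bound(eps, lam, big_n, trace_a):
    """Right-hand side of (7)."""
    return (eps / 2 + 2 * lam * trace_a) / (1 - math.exp(-lam * big_n ** 2 / 2))


# Check of the Gaussian identity: the integral equals lam * (A e_1, e_1) + lam * (A e_2, e_2).
lam0 = 0.5
integral = gaussian_quadratic_integral(lam0)

# One admissible (not optimal) choice of lambda and N for a given eps.
eps = 0.1
lam = eps / (16 * TRACE_A)                  # makes 2 * lam * Sp A equal to eps / 8
big_n = math.sqrt(2 * math.log(4) / lam)    # makes exp(-lam * N^2 / 2) equal to 1/4
bound = tail_bound(eps, lam, big_n, TRACE_A)

print(integral, lam0 * TRACE_A)
print(bound, eps)
```

With this choice the right-hand side of (7) equals $(\varepsilon/2+\varepsilon/8)/(3/4)=5\varepsilon/6<\varepsilon$, independently of the subspace.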

Generalized measures on a Hilbert space. In Section 3 a procedure was described which enables us to construct, for each positive definite function $\varphi$ defined on $\mathcal{X}^*$ (the conjugate of the space $\mathcal{X}$), an extension $\tilde{\mathcal{X}}$ of the space $\mathcal{X}$ and a measure $\mu$ in $\tilde{\mathcal{X}}$ such that $\varphi(x^*)$ becomes the characteristic functional of this measure. Let $\varphi(z)$ be a positive definite function defined on the Hilbert space $\mathcal{X}$. Utilizing the above-mentioned result, one can construct an extension of the space and a measure on that extension such that $\varphi(z)$ will be the characteristic functional of this measure. However, the procedure presented in Section 3 results in a space $\tilde{\mathcal{X}}$ which is too wide. In the case where $\mathcal{X}$ is a Hilbert space, $\tilde{\mathcal{X}}$ can also be constructed as a Hilbert space obtained by completion of $\mathcal{X}$ in a certain scalar product depending on the continuity conditions of $\varphi(z)$. We shall now consider this construction, due to Yu. L. Daletskii.

Let $B$ be a bounded symmetric positive linear operator. Introduce in $\mathcal{X}$ a new scalar product

$$(x,y)_-=(Bx,y),\qquad |x|_-^2=(Bx,x). \tag{8}$$

The space $\mathcal{X}$ is in general incomplete in the metric generated by this scalar product. Denote the completion of $\mathcal{X}$ in the norm $|\cdot|_-$ by $\mathcal{X}_-^B$ (this set can be regarded as an extension of $\mathcal{X}$); $\mathcal{X}$ is an everywhere dense set in $\mathcal{X}_-^B$, and $\mathcal{X}_-^B$ and $\mathcal{X}$ coincide if $B^{-1}$ is a bounded operator. Denote by $\mathcal{X}_+^B$ the Hilbert space obtained from the domain of definition of the operator $B^{-1/2}$ (which is dense in $\mathcal{X}$) by introducing the scalar product

$$(x,y)_+=(B^{-1/2}x,B^{-1/2}y)=(B^{-1}x,y). \tag{9}$$

The second equality in formula (9) requires certain clarification. Note that for any $x\in\mathcal{X}_+^B$ the scalar product $(x,z)$ defined for $z$ in $\mathcal{X}$ can be extended by continuity in the metric of $\mathcal{X}_-^B$ to the whole space $\mathcal{X}_-^B$. Indeed, let $x=B^{1/2}x_0$, $x_0\in\mathcal{X}$. Then

$$|(x,z_n-z_m)|=|(B^{1/2}x_0,z_n-z_m)|=|(x_0,B^{1/2}(z_n-z_m))|\le|x_0|\,\bigl(B^{1/2}(z_n-z_m),B^{1/2}(z_n-z_m)\bigr)^{1/2}=|x_0|\,|z_n-z_m|_-.$$

Therefore the linear functional $(x,z)$ on $\mathcal{X}$ is continuous in the metric of $\mathcal{X}_-^B$ and hence it can be extended by continuity (in $z$) to the space $\mathcal{X}_-^B$. In what follows, the expression $(x,z)$, where $x\in\mathcal{X}_+^B$ and $z\in\mathcal{X}_-^B$, is understood to be this extension. The operator $B$ can also be extended by continuity to $\mathcal{X}_-^B$ since

$$|Bx|_-=\sqrt{(Bx,Bx)_-}=\sqrt{(B^2x,Bx)}=\sqrt{(B^2B^{1/2}x,B^{1/2}x)}\le\sqrt{\|B^2\|\,(B^{1/2}x,B^{1/2}x)}=\|B\|\,|x|_-.$$

In what follows $B$ will be regarded as extended to $\mathcal{X}_-^B$. Moreover the following relations are satisfied:

$$B^{1/2}\mathcal{X}_-^B=\mathcal{X},\qquad B^{1/2}\mathcal{X}=\mathcal{X}_+^B,\qquad B\mathcal{X}_-^B=\mathcal{X}_+^B.$$

The third relation is a consequence of the first two. The second follows from the definition of $\mathcal{X}_+^B$. We now prove the first assertion. Let $z$ be an arbitrary element in $\mathcal{X}_-^B$, $z_n\in\mathcal{X}$ and $|z_n-z|_-\to0$. This means that

$$|B^{1/2}z_n-B^{1/2}z_m|^2=(B(z_n-z_m),z_n-z_m)\to0,\qquad n,m\to\infty.$$

However $B^{1/2}z_n\in\mathcal{X}$, therefore $B^{1/2}z\in\mathcal{X}$ also. We now return to equality (9). From the above, $B^{-1}x$ is (well) defined for $x\in\mathcal{X}_+^B$ and belongs to $\mathcal{X}_-^B$; since $y\in\mathcal{X}_+^B$, the scalar product $(B^{-1}x,y)$ is also well defined.

Now let a certain measure $\mu$ be defined on $\mathcal{X}_-^B$. Then in terms of this measure one can construct the characteristic functional $\varphi_-(z)=\int e^{i(x,z)_-}\,\mu(dx)$ defined for $z\in\mathcal{X}_-^B$. Since $(x,z)_-=(Bz,x)$ and $Bz\in\mathcal{X}_+^B$, the measure $\mu$ can also be defined by means of the characteristic functional $\varphi(z)=\int e^{i(z,x)}\,\mu(dx)$, where $z\in\mathcal{X}_+^B$. Note that

$$\varphi_-(z)=\varphi(Bz),\qquad\varphi(z)=\varphi_-(B^{-1}z).$$

It follows from Theorem 1 that $\varphi(z)$ is the characteristic functional of a measure on $\mathcal{X}_-^B$ if and only if for each $\varepsilon>0$ a kernel operator $S_{\varepsilon}$ exists on $\mathcal{X}_-^B$ such that $\operatorname{Re}(\varphi_-(0)-\varphi_-(z))\le\varepsilon$, provided $(S_{\varepsilon}z,z)_-\le1$. We utilize this result to construct an extension of $\mathcal{X}$ in order that the given positive definite functional $\varphi(z)$ be a characteristic functional on this extension.

Theorem 2. Let $\varphi(z)$ be a continuous positive definite functional defined on $\mathcal{X}$. Then for any kernel operator $B$, $\varphi(z)$ is a characteristic functional of a certain measure on $\mathcal{X}_-^B$.

Proof. It follows from the continuity of $\varphi(z)$ that for any $\varepsilon>0$ a $\delta>0$ can be found such that $\operatorname{Re}(\varphi(0)-\varphi(z))\le\varepsilon$ provided $(z,z)\le\delta$. Then $\operatorname{Re}(\varphi(0)-\varphi(Bz))\le\varepsilon$ for $z\in\mathcal{X}$ provided $(Bz,Bz)\le\delta$, i.e. $\operatorname{Re}(\varphi_-(0)-\varphi_-(z))\le\varepsilon$ if $\bigl(\frac{1}{\delta}Bz,z\bigr)_-\le1$.

We show that the operator $\frac{1}{\delta}B$ defined on $\mathcal{X}_-^B$ is a kernel operator. It is sufficient to show that $B$ is such an operator. But

1) $(Bx,y)_-=(B^2x,y)=(Bx,By)=(x,By)_-$;

2) $(Bx,x)_-=(Bx,Bx)\ge0$.

Finally we show that

3) $\operatorname{Sp}_-B=\sum_{k=1}^{\infty}(Be_k,e_k)_-<\infty$,

where $\{e_k\}$ is an orthonormal basis in $\mathcal{X}_-^B$. Indeed, set $e_k=f_k/\sqrt{\lambda_k}$, where $\{f_k\}$ is the basis consisting of the eigenvectors of the operator $B$ in $\mathcal{X}$, $\lambda_k=(Bf_k,f_k)$, $(f_k,f_k)=1$. Then

$$\operatorname{Sp}_-(B)=\sum_{k=1}^{\infty}(Be_k,e_k)_-=\sum_{k=1}^{\infty}\frac{(B^2f_k,f_k)}{\lambda_k}=\sum_{k=1}^{\infty}\lambda_k=\operatorname{Sp}B.$$

The theorem is thus proved. □
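The relations between the scalar products $(\cdot,\cdot)_-$, $(\cdot,\cdot)_+$ and the trace computation in step 3) can be sketched in finite dimensions. This is only an illustration under simplifying assumptions: `LAM` holds hypothetical eigenvalues of $B$, and all vectors are written in the eigenbasis $\{f_k\}$ of $B$, so that $B$ acts diagonally; in finite dimensions the three spaces coincide as sets and differ only in their norms.

```python
import math

# Hypothetical eigenvalues of B; vectors are given by coordinates in the eigenbasis f_k.
LAM = [1.0, 0.5, 0.25, 0.125]
DIM = len(LAM)


def inner(x, y):
    """Scalar product (x, y) of the original space."""
    return sum(p * q for p, q in zip(x, y))


def apply_B(x, power=1.0):
    """Apply B^power, which is diagonal in the eigenbasis."""
    return [lam ** power * c for lam, c in zip(LAM, x)]


def inner_minus(x, y):
    """(x, y)_- = (Bx, y), formula (8)."""
    return inner(apply_B(x), y)


def norm_minus(z):
    return math.sqrt(inner_minus(z, z))


def norm_plus(x):
    """|x|_+ with (x, y)_+ = (B^{-1} x, y), formula (9)."""
    return math.sqrt(inner(apply_B(x, -1.0), x))


# e_k = f_k / sqrt(lam_k) is orthonormal with respect to (., .)_-.
basis_minus = [
    [1.0 / math.sqrt(LAM[k]) if i == k else 0.0 for i in range(DIM)]
    for k in range(DIM)
]

trace_minus = sum(inner_minus(apply_B(e), e) for e in basis_minus)  # Sp_- B
trace_plain = sum(LAM)                                              # Sp B

# The pairing (x, z) is controlled by |x|_+ |z|_-, which is what allows the
# extension by continuity described above.
x = [1.0, -2.0, 0.5, 3.0]
z = [0.3, 1.0, -4.0, 2.0]
pairing = abs(inner(x, z))
bound = norm_plus(x) * norm_minus(z)

print(trace_minus, trace_plain)
print(pairing, bound)
```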

Remark 1. Let a positive definite function $\varphi(z)$ satisfy the condition: for any $\varepsilon>0$ there exists $\delta>0$ such that $\operatorname{Re}(\varphi(0)-\varphi(z))<\varepsilon$, provided only $(Vz,z)<\delta$, where $V$ is a bounded symmetric positive operator. Consider the space $\mathcal{X}_-^S$, where $S$ is a certain symmetric positive operator commuting with $V$. We now obtain the conditions that must be imposed on $S$ in order that a measure exist in $\mathcal{X}_-^S$ with characteristic functional $\varphi(z)$. Since

$$\operatorname{Re}(\varphi_-(0)-\varphi_-(z))=\operatorname{Re}(\varphi(0)-\varphi(Sz))<\varepsilon$$

for $(VSz,Sz)=(VSz,z)_-<\delta$ and $\operatorname{Sp}_-VS=\operatorname{Sp}VS$, such a measure exists provided $\operatorname{Sp}VS<\infty$. This assertion is valid also in the case when $\varphi(z)$ is defined on a linear manifold which is dense in $\mathcal{X}$ and $V$ is an unbounded operator.

Measures defined on $\mathcal{X}_-^S$ for some $S$, whose characteristic functionals, taken with respect to the scalar product of $\mathcal{X}$, are defined on an everywhere dense set in $\mathcal{X}$, are called generalized measures on $\mathcal{X}$. Theorem 2 shows that a generalized measure is not uniquely constructed by means of its characteristic functional defined on $\mathcal{X}$. Let $\mathcal{X}'$ and $\mathcal{X}''$ be two extensions of the space $\mathcal{X}$ in which the measures $\mu'$ and $\mu''$ are defined which correspond to the same characteristic functional $\varphi(z)$. Then one can find an extension $\mathcal{X}'''$ which is included in each one of the extensions $\mathcal{X}'$ and $\mathcal{X}''$ such that $\mu'(\mathcal{X}'-\mathcal{X}''')=0$ and $\mu''(\mathcal{X}''-\mathcal{X}''')=0$ and $\mu'$ coincides with $\mu''$ on $\mathcal{X}'''$. This extension is easily constructed as follows: if

$$\mathcal{X}'=\mathcal{X}_-^{S_1},\qquad\mathcal{X}''=\mathcal{X}_-^{S_2},\qquad\text{then}\qquad\mathcal{X}'''=\mathcal{X}_-^{S_1+S_2}.$$

Therefore a generalized measure is, in a certain sense, uniquely constructed.

It is more convenient to formulate the conditions of Theorem 1 in the spaces $\mathcal{X}_-^B$.

Remark 2. In order that the conditions of Theorem 1 be satisfied it is necessary and sufficient that a kernel operator $B$ exist such that the functional $\varphi(z)$ be continuous in the metric of $\mathcal{X}_-^B$ and hence extendable to $\mathcal{X}_-^B$. Indeed, if $\varphi(z)$ is continuous in the metric of $\mathcal{X}_-^B$, then for any $\varepsilon>0$ a $\delta>0$ can be found such that $\operatorname{Re}(\varphi(0)-\varphi(z))\le\varepsilon$, provided that $(Bz,z)\le\delta$, i.e. $\bigl(\frac{1}{\delta}Bz,z\bigr)\le1$, where $\frac{1}{\delta}B$ is a kernel operator.

Conversely we show that the existence of the operator $B$ follows from the condition of Theorem 1. We choose a sequence $\varepsilon_n\downarrow0$ and let the operators $A_n$ satisfy the condition of Theorem 1 for $\varepsilon=\varepsilon_n$. Let $c_n$ be a sequence ($c_n\downarrow0$) such that

$$\sum_{n=1}^{\infty}c_n\operatorname{Sp}A_n<\infty.$$

Then the operator $B=\sum_{n=1}^{\infty}c_nA_n$ is the required kernel operator. Indeed, for any $\varepsilon>0$ an $n$ can be found such that $\varepsilon_n<\varepsilon$. Then for $(Bz,z)<c_n$ we have $(A_nz,z)<1$ and hence

$$\operatorname{Re}(\varphi(0)-\varphi(z))\le\varepsilon_n<\varepsilon.$$

Next

$$|\varphi(z_1)-\varphi(z_2)|=\Bigl|\int\bigl(e^{i(z_1,x)}-e^{i(z_2,x)}\bigr)\,\mu(dx)\Bigr|\le\Bigl(\int\bigl|e^{i(z_1-z_2,x)}-1\bigr|^2\,\mu(dx)\Bigr)^{1/2}=\sqrt{2\operatorname{Re}(\varphi(0)-\varphi(z_1-z_2))}.$$

It follows from this inequality that $|\varphi(z_1)-\varphi(z_2)|\to0$ as

$$(B(z_1-z_2),z_1-z_2)\to0.$$

From the last remark we deduce, in particular, that the function $e^{-|z|^2/2}$, which is a continuous and positive definite function, cannot be a characteristic functional of a measure on $\mathcal{X}$, since it cannot be extended continuously onto $\mathcal{X}_-^B$ with a kernel operator $B$.



§6. Gaussian Measures in a Hilbert Space 

Let $\mu$ be a probability measure on a measurable Hilbert space $(\mathcal{X},\mathfrak{B})$. Then $(\mathcal{X},\mathfrak{B},\mu)$ is a probability space and any $\mathfrak{B}$-measurable function $g(x)$ is a random variable on this space. The measure $\mu$ is called Gaussian if every continuous linear functional $l_z(x)=(z,x)$ is a normally distributed random variable. Let

$$\alpha_z=\mathsf{E}(z,x)=\int(z,x)\,\mu(dx),$$

$$\beta_z=\mathsf{E}(z,x)^2-\alpha_z^2=\int(z,x)^2\,\mu(dx)-\alpha_z^2.$$

Since the distribution of $(z,x)$ is normal, these variables are defined for each $z$ and hence, as was established in Section 5 in the course of the study of polylinear moment forms, a vector $a$ and a bounded symmetric non-negative linear operator $B$ exist such that

$$\alpha_z=(a,z),\qquad\beta_z=(Bz,z).$$

Since $(z,x)$ is normally distributed, we have

$$\mathsf{E}e^{i(z,x)}=\int e^{i(z,x)}\,\mu(dx)=\exp\{i(a,z)-\tfrac12(Bz,z)\}.$$

Thus any Gaussian measure possesses the mean value $a$ and the correlation operator $B$, and moreover the characteristic functional of this measure is of the form

$$\varphi(z)=\exp\{i(a,z)-\tfrac12(Bz,z)\}. \tag{1}$$

Conversely, if the characteristic functional of the measure $\mu$ is of the form (1), then

$$\mathsf{E}e^{it(z,x)}=\varphi(tz)=\exp\Bigl\{it(a,z)-\frac{t^2}{2}(Bz,z)\Bigr\}$$

and hence the variable $(z,x)$ is normally distributed with the mean $(a,z)$ and the variance $(Bz,z)$. Hence the necessary and sufficient condition for a measure $\mu$ to be Gaussian is that the characteristic functional of this measure admit representation (1).

It follows from formula (1) that for any finite set of vectors $z_1,\dots,z_n$ the joint distribution of the variables $(z_1,x),\dots,(z_n,x)$ is also Gaussian. Indeed,

$$\mathsf{E}\exp\Bigl\{i\sum_k t_k(z_k,x)\Bigr\}=\varphi\Bigl(\sum_k t_kz_k\Bigr)=\exp\Bigl\{i\sum_k t_k(a,z_k)-\frac12\sum_{k,j}t_kt_j(Bz_k,z_j)\Bigr\}.$$

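How arbitrary is the choice of the quantities $a$ and $B$ in formula (1)? If $B$ is a positive definite operator, then the function $\varphi(z)$ defined by (1) is positive definite. Other restrictions on $a$ and $B$ are imposed by the Minlos-Sazonov theorem. Let $A_{\varepsilon}$ be a kernel operator, for a given $\varepsilon>0$, such that

$$\operatorname{Re}(1-\varphi(z))<\varepsilon\quad\text{for}\quad(A_{\varepsilon}z,z)<1.$$

Then for $(A_{\varepsilon}z,z)<1$ the following inequalities are satisfied:

$$\tfrac12(Bz,z)\le\exp\{\tfrac12(Bz,z)\}-1=\bigl[1-\exp\{-\tfrac12(Bz,z)\}\bigr]\bigl[1-(1-\exp\{-\tfrac12(Bz,z)\})\bigr]^{-1}\le$$

$$\le\bigl[1-\exp\{-\tfrac12(Bz,z)\}\cos(a,z)\bigr]\times\bigl[1-(1-\exp\{-\tfrac12(Bz,z)\}\cos(a,z))\bigr]^{-1}<\frac{\varepsilon}{1-\varepsilon}$$

(if $\varepsilon<1$, then $\cos(a,z)>0$). Therefore

$$(Bz,z)\le\frac{2\varepsilon}{1-\varepsilon}(A_{\varepsilon}z,z)$$

and

$$\operatorname{Sp}B\le\frac{2\varepsilon}{1-\varepsilon}\operatorname{Sp}A_{\varepsilon}.$$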

Consequently, the condition $\operatorname{Sp}B<\infty$ is a necessary condition for formula (1) to determine a characteristic functional of a measure on $(\mathcal{X},\mathfrak{B})$. We now show that this condition is also sufficient. Since

$$|1-\varphi(z)|\le\tfrac12(Bz,z)+|(a,z)|,$$

it follows that $|1-\varphi(z)|<\varepsilon$, provided only $\frac{1}{\varepsilon}(Bz,z)+\frac{4}{\varepsilon^2}(a,z)^2<1$. Setting

$$A_{\varepsilon}=\frac{1}{\varepsilon}B+\frac{4}{\varepsilon^2}P_a,\qquad\text{where}\quad P_az=(a,z)a,$$

we observe that the conditions of the Minlos-Sazonov theorem are satisfied, since

$$\operatorname{Sp}A_{\varepsilon}=\frac{1}{\varepsilon}\operatorname{Sp}B+\frac{4}{\varepsilon^2}|a|^2<\infty.$$

Therefore the following result is obtained:

Theorem 1. A measure $\mu$ is a Gaussian measure on $(\mathcal{X},\mathfrak{B})$ if and only if its characteristic functional $\varphi(z)$ admits representation (1), where $a$ is an arbitrary vector in $\mathcal{X}$ and $B$ is a kernel operator. Moreover, $a$ is the mean value of the measure $\mu$ and $B$ is its correlation operator.

As follows from the lemma in Section 5, for any Gaussian measure $\int|x|^2\,\mu(dx)<\infty$.

Let $e_1,e_2,\dots$ be an orthonormal basis of the eigenvectors of $B$, whose existence follows from the fact that $B$ is completely continuous. If $\lambda_k$ is the eigenvalue of $B$ corresponding to $e_k$, then

$$(Be_i,e_j)=\lambda_i\delta_{ij}.$$

Therefore the random variables $(x,e_k)$, $k=1,\dots,n$, possess the joint characteristic function

$$\mathsf{E}\exp\Bigl\{i\sum_{k=1}^n t_k(x,e_k)\Bigr\}=\exp\Bigl\{i\sum_{k=1}^n t_k(a,e_k)-\frac12\sum_{k=1}^n\lambda_kt_k^2\Bigr\}.$$

The last formula shows that the variables $(x,e_k)$, $k=1,\dots,n$, are jointly independent. If $\lambda_k>0$, then the variable $\frac{(x,e_k)-(a,e_k)}{\sqrt{\lambda_k}}=\xi_k$ is normally distributed with mean 0 and variance 1. Regarding $x$ as a random element on the probability space $(\mathcal{X},\mathfrak{B},\mu)$ one can write

$$x=a+\sum_k\sqrt{\lambda_k}\,\xi_ke_k, \tag{2}$$

where $\xi_k$ are independent identically distributed Gaussian random variables on $(\mathcal{X},\mathfrak{B},\mu)$ defined for all $k$ such that $\lambda_k>0$, $\mathsf{E}\xi_k=0$, $\operatorname{Var}\xi_k=1$.
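The representation (2) can be utilized for various calculations. We consider an example of an application of formula (2). We shall evaluate the Laplace transform of the variable $|x|^2$. Since

$$|x|^2=\sum_k\bigl(a_k+\sqrt{\lambda_k}\,\xi_k\bigr)^2=|a|^2+\sum_k\bigl(\lambda_k\xi_k^2+2a_k\sqrt{\lambda_k}\,\xi_k\bigr),\qquad\text{where}\quad a_k=(a,e_k),$$

it follows that

$$\mathsf{E}e^{s|x|^2}=e^{s|a|^2}\prod_{k=1}^{\infty}\mathsf{E}\exp\bigl\{s\lambda_k\xi_k^2+2sa_k\sqrt{\lambda_k}\,\xi_k\bigr\}=e^{s|a|^2}\prod_{k=1}^{\infty}\frac{1}{\sqrt{1-2s\lambda_k}}\exp\Bigl\{\frac{2\lambda_ka_k^2s^2}{1-2s\lambda_k}\Bigr\}.$$

The last infinite product converges for $\operatorname{Re}s<\frac{1}{2\|B\|}$; this follows from the convergence of the series $\sum_k\lambda_k=\operatorname{Sp}B$. The infinite product obtained can be expressed by means of the operator

$$R_s(B)=(I-2sB)^{-1}, \tag{3}$$

which is easily expressed in terms of the resolvent of the operator $B$. Indeed

$$\prod_{k=1}^{\infty}\frac{1}{\sqrt{1-2s\lambda_k}}=\exp\Bigl\{-\frac12\sum_{k=1}^{\infty}\ln(1-2s\lambda_k)\Bigr\}=\exp\Bigl\{\int_0^s\sum_{k=1}^{\infty}\frac{\lambda_k}{1-2t\lambda_k}\,dt\Bigr\}=\exp\Bigl\{\int_0^s\operatorname{Sp}BR_t(B)\,dt\Bigr\},$$

$$\sum_{k=1}^{\infty}\frac{2\lambda_ka_k^2}{1-2s\lambda_k}=2(BR_s(B)a,a).$$

Thus for all $s<\frac{1}{2\|B\|}$ the following formula is valid:

$$\int e^{s|x|^2}\,\mu(dx)=\exp\Bigl\{2s^2(BR_s(B)a,a)+\int_0^s\operatorname{Sp}BR_t(B)\,dt+s|a|^2\Bigr\}. \tag{4}$$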
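Formula (4) can be compared numerically with the product form and with a direct evaluation based on decomposition (2). The sketch below works under assumptions that are not in the text: a three-dimensional space, hypothetical eigenvalues `LAM` and mean coordinates `A_COORD`, the operator $R_t(B)=(I-2tB)^{-1}$ written in the eigenbasis of $B$, and elementary quadrature rules (Simpson's rule and a Riemann sum) in place of exact integration.

```python
import math

# Hypothetical eigenvalues of B and coordinates a_k = (a, e_k) of the mean (illustrative).
LAM = [1.0, 0.5, 0.25]
A_COORD = [0.5, -1.0, 2.0]
S = 0.2  # must satisfy S < 1 / (2 * max(LAM))
NORM_A_SQ = sum(a * a for a in A_COORD)


def product_form(s):
    """exp(s|a|^2) * prod_k (1 - 2 s lam_k)^(-1/2) exp(2 lam_k a_k^2 s^2 / (1 - 2 s lam_k))."""
    value = math.exp(s * NORM_A_SQ)
    for lam, a in zip(LAM, A_COORD):
        denom = 1 - 2 * s * lam
        value *= math.exp(2 * lam * a * a * s * s / denom) / math.sqrt(denom)
    return value


def trace_BR(t):
    """Sp B R_t(B), computed in the eigenbasis of B."""
    return sum(lam / (1 - 2 * t * lam) for lam in LAM)


def simpson(f, lo, hi, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3


def operator_form(s):
    """Right-hand side of (4)."""
    quad = sum(lam * a * a / (1 - 2 * s * lam) for lam, a in zip(LAM, A_COORD))  # (B R_s(B) a, a)
    return math.exp(2 * s * s * quad + simpson(trace_BR, 0.0, s) + s * NORM_A_SQ)


def direct_form(s, half_width=20.0, step=0.01):
    """E exp(s|x|^2) from decomposition (2), one independent coordinate at a time."""
    value = 1.0
    n = int(round(2 * half_width / step))
    for lam, a in zip(LAM, A_COORD):
        total = 0.0
        for i in range(n + 1):
            t = -half_width + i * step
            coord = a + math.sqrt(lam) * t          # k-th coordinate of x for xi_k = t
            total += math.exp(s * coord * coord - t * t / 2)
        value *= total * step / math.sqrt(2 * math.pi)
    return value


p, o, d = product_form(S), operator_form(S), direct_form(S)
print(p, o, d)
```

The three values agree up to quadrature error as long as $s<1/(2\|B\|)$; for larger $s$ the integral in `direct_form` diverges and the closed forms lose their meaning.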



This formula can also be used for the determination of the Laplace transform of the variable $(Vx,x)$ on the probability space $(\mathcal{X},\mathfrak{B},\mu)$, provided only that $V$ is a non-negative symmetric operator. Let $V=U^2$, where $U$ is also a non-negative operator. In this case $(Vx,x)=|Ux|^2$, where $Ux$ is a random element with values in $\mathcal{X}$ on the probability space $(\mathcal{X},\mathfrak{B},\mu)$. The characteristic functional of the variable $Ux$ is of the form

$$\int e^{i(Ux,z)}\,\mu(dx)=\int e^{i(x,Uz)}\,\mu(dx)=\exp\{i(Ua,z)-\tfrac12(UBUz,z)\}.$$

Consequently, $Ux$ has a Gaussian distribution with the mean $Ua$ and correlation operator $UBU$. Hence we have, in view of formula (4):

$$\int e^{s(Vx,x)}\,\mu(dx)=\exp\Bigl\{2s^2(UBUR_s(UBU)Ua,Ua)+\int_0^s\operatorname{Sp}UBUR_t(UBU)\,dt+s|Ua|^2\Bigr\}. \tag{5}$$



Linear and quadratic functionals. Let $\mu$ be a Gaussian measure on $(\mathcal{X},\mathfrak{B})$. Any measurable function $g(x)$ admitting representation as the limit in measure $\mu$ of a sequence of continuous linear functionals

$$g(x)=\lim_{n\to\infty}(x,z_n)$$

is called a measurable linear functional with respect to the measure $\mu$. Since the variables $(x,z_n)$, $n=1,2,\dots$, have a joint Gaussian distribution, it follows from the convergence of $(x,z_n)$ in measure $\mu$ to $g(x)$ that $g(x)$ will also have a normal distribution and also that $(x,z_n)$ converges to $g(x)$ in the mean square. Hence

$$\lim_{n,m\to\infty}\int[(x,z_n)-(x,z_m)]^2\,\mu(dx)=\lim_{n,m\to\infty}\bigl[(a,z_n-z_m)^2+(B(z_n-z_m),z_n-z_m)\bigr]=0.$$

Set $A=B+P_a$, where $P_az=(a,z)a$. $A$ is then a kernel operator. We introduce the scalar product $(x,y)_-=(Ax,y)$.

Let $\mathcal{X}_-^A$ be the completion of $\mathcal{X}$ in this scalar product. If

$$\int[(z_n,x)-(z_m,x)]^2\,\mu(dx)\to0,\qquad\text{then}\qquad(z_n-z_m,z_n-z_m)_-\to0,$$

i.e. the sequence $z_n$ is fundamental (Cauchy) in $\mathcal{X}_-^A$. It is natural to associate with the function $g(x)$, which is the limit (in measure $\mu$) of the sequence $(x,z_n)$, the element in $\mathcal{X}_-^A$ which is the limit of $z_n$. If $\lim z_n=z^*$, we denote $g(x)=(x,z^*)$. It is easy to see that this correspondence between the measurable linear functionals and $\mathcal{X}_-^A$ is one-to-one. In what follows we shall identify the space of linear functionals with $\mathcal{X}_-^A$. The space of measurable linear functionals is a Hilbert space with scalar product $(x,y)_-$. For any set of $z_1^*,\dots,z_n^*$ belonging to $\mathcal{X}_-^A$, the functionals $(x,z_1^*),\dots,(x,z_n^*)$ have a joint normal distribution and moreover

$$\mathsf{E}\exp\Bigl\{i\sum_k t_k(z_k^*,x)\Bigr\}=\exp\Bigl\{i\sum_k t_k(z_k^*,a)-\frac12\sum_{k,j}t_kt_j(Bz_k^*,z_j^*)\Bigr\},$$

where $(z^*,a)$ is defined as the limit $\lim_{n\to\infty}(z_n,a)$, $z_n\in\mathcal{X}$, $z_n\to z^*$ in $\mathcal{X}_-^A$, and $(Bz_k^*,z_j^*)$ is defined as the limit $\lim_{n\to\infty}(Bz_k^n,z_j^n)$, $z_k^n\to z_k^*$, $z_j^n\to z_j^*$ in $\mathcal{X}_-^A$. The existence of both limits follows from the inequalities

$$(z_n-z_m,a)^2\le(z_n-z_m,z_n-z_m)_-,$$

$$(B(z_n-z_m),z_n-z_m)\le(z_n-z_m,z_n-z_m)_-.$$

Since every element $z^*\in\mathcal{X}_-^A$ can be represented as $B^{-1/2}z$, $z\in\mathcal{X}$, we use in place of $(z^*,x)$ the notation $(z,B^{-1/2}x)$, where $z\in\mathcal{X}$.

We now derive the representation of measurable linear functionals which uses decomposition (2). For any $z\in\mathcal{X}$

$$(z,x)=(z,a)+\sum_k\sqrt{\lambda_k}\,\xi_k(e_k,z),$$

where $\xi_k$ is a sequence of independent identically distributed random variables with $\mathsf{E}\xi_k=0$ and $\operatorname{Var}\xi_k=1$. If $(z_n,x)\to(z^*,x)$ in the mean square, then $(z_n,a)\to(z^*,a)$, and the limit of $(z_n,e_k)$ will exist for those $k$ which satisfy $\lambda_k>0$. We denote this limit by $(z^*,e_k)$. Then

$$(z^*,x)=(z^*,a)+\sum_k\sqrt{\lambda_k}\,\xi_k(z^*,e_k). \tag{6}$$

Conversely, formula (6) defines a measurable linear functional for any sequence of numbers $(z^*,e_k)$ such that the series

$$\sum_k\lambda_k(z^*,e_k)^2\qquad\text{and}\qquad\sum_k(a,e_k)(z^*,e_k)=(z^*,a)$$

are convergent.

We shall now study measurable quadratic functionals for measures $\mu$ with mean zero. We shall distinguish between measurable quadratic functionals and measurable centered quadratic functionals.

A random variable $g(x)$ on a probability space $(\mathcal{X},\mathfrak{B},\mu)$ is called a measurable quadratic functional if a sequence of symmetric linear bounded operators $A_n$ exists such that

$$g(x)=\lim_{n\to\infty}(A_nx,x)$$

in measure $\mu$. A random variable $g(x)$ is called a measurable centered quadratic functional if a sequence of symmetric linear bounded operators $A_n$ and constants $c_n$ exist such that

$$g(x)=\lim_{n\to\infty}[(A_nx,x)+c_n]$$

in measure $\mu$.

Assume that the operator $B$ is nondegenerate (otherwise one can consider a measure on the closure of the range of the operator $B$). Next let $a_{ik}^n=(A_ne_i,e_k)$, where $e_k$ are the eigenvectors of the operator $B$. Utilizing decomposition (2) one can write

$$(A_nx,x)=\sum_{i,k}a_{ik}^n\sqrt{\lambda_i}\sqrt{\lambda_k}\,\xi_i\xi_k.$$

Since $a_{ik}^n\sqrt{\lambda_i}\sqrt{\lambda_k}=(B^{1/2}A_nB^{1/2}e_i,e_k)$, one can formally represent $(A_nx,x)$ in the form

$$(A_nx,x)=(B^{1/2}A_nB^{1/2}y,y),$$

where $y=\sum_k\xi_ke_k$ is a certain generalized random element in $\mathcal{X}$, i.e. a random element whose distribution is a generalized measure in $\mathcal{X}$ (cf. Section 5). We note that for any $z\in\mathcal{X}$ the scalar product $(z,y)=\sum_k(z,e_k)\xi_k$ is defined, since the $\xi_k$ are independent with $\mathsf{E}\xi_k=0$ and

$$\operatorname{Var}\bigl((z,e_k)\xi_k\bigr)=(z,e_k)^2,\qquad\sum_{k=1}^{\infty}\operatorname{Var}\bigl((z,e_k)\xi_k\bigr)=|z|^2.$$

Hence if $\nu$ is the generalized measure which is the distribution of the element $y$, then its characteristic functional $\varphi_{\nu}(z)$ is equal to $e^{-|z|^2/2}$.

Let $f_1,f_2,\dots$ be an orthonormal basis of eigenvectors of the operator $B^{1/2}A_nB^{1/2}$ (it is completely continuous). Then one can set

$$y=\sum_k\eta_kf_k,$$

where $\eta_k=(y,f_k)=\sum_j(e_j,f_k)\xi_j$ is a sequence of random variables on the probability space $(\mathcal{X},\mathfrak{B},\mu)$. It follows from the relation

$$e^{-|z|^2/2}=\mathsf{E}\exp\Bigl\{i\sum_k\eta_k(z,f_k)\Bigr\}=\exp\Bigl\{-\frac12\sum_k(z,f_k)^2\Bigr\}$$

that the $\eta_k$ are also independent Gaussian random variables with $\mathsf{E}\eta_k=0$ and $\operatorname{Var}\eta_k=1$. Let $c_k^n$ be the eigenvalues of the operator $B^{1/2}A_nB^{1/2}$ corresponding to $f_k$. Then

$$(A_nx,x)=\sum_kc_k^n\eta_k^2.$$


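Lemma. Let for each $n$ a sequence $\eta_{nk}$ of independent Gaussian variables be given with $\mathsf{E}\eta_{nk}=0$ and $\operatorname{Var}\eta_{nk}=1$. If constants $d_n$ exist such that $\sum_kc_k^n\eta_{nk}^2+d_n\to0$ in probability as $n\to\infty$, then

$$\sum_k(c_k^n)^2\to0\qquad\text{and}\qquad\sum_kc_k^n+d_n\to0.$$

Proof. First note that under the assumptions of the lemma $\sup_k|c_k^n|\to0$, since for each $k$

$$\mathsf{P}\Bigl\{\Bigl|\sum_jc_j^n\eta_{nj}^2+d_n\Bigr|\le\varepsilon\Bigr\}\le\sup_{\alpha}\mathsf{P}\{|c_k^n\eta_{nk}^2+\alpha|\le\varepsilon\}\le\sqrt{\frac{8\varepsilon}{\pi|c_k^n|}}$$

and hence for each $\varepsilon>0$

$$\sup_k|c_k^n|\le\frac{8\varepsilon}{\pi}\Bigl(\mathsf{P}\Bigl\{\Bigl|\sum_jc_j^n\eta_{nj}^2+d_n\Bigr|\le\varepsilon\Bigr\}\Bigr)^{-2}.$$

Now if the inequality $\sum_k(c_k^n)^2>\delta$ were satisfied for some sequence of indices $n$, then the variable $\sum_kc_k^n\eta_{nk}^2$ would, in view of the central limit theorem, be asymptotically normal, and therefore for each $\varepsilon>0$ we would have

$$1=\lim_{n\to\infty}\mathsf{P}\Bigl\{\Bigl|\sum_kc_k^n\eta_{nk}^2+d_n\Bigr|<\varepsilon\Bigr\}\le\limsup_{n\to\infty}\sup_{\alpha}\mathsf{P}\Bigl\{\Bigl|\sum_kc_k^n\eta_{nk}^2+\alpha\Bigr|<\varepsilon\Bigr\}\le\frac{2\varepsilon}{\sqrt{2\pi\delta}},$$

which is impossible. Finally, utilizing the relation

$$\sum_jc_j^n\eta_{nj}^2+d_n=\sum_jc_j^n(\eta_{nj}^2-1)+\sum_jc_j^n+d_n$$

and the fact that

$$\mathsf{E}\Bigl(\sum_jc_j^n(\eta_{nj}^2-1)\Bigr)^2=2\sum_j(c_j^n)^2\to0,$$

we observe that $\sum_jc_j^n+d_n\to0$. The lemma is thus proved. □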

j 

It follows from the lemma that convergence of $(A_nx,x)+d_n$ in measure to a certain limit implies the convergence of $(A_nx,x)+d_n$ to the same limit in the mean square. It is easy to verify the calculations:

$$\int[(A_nx,x)+d_n]^2\,\mu(dx)=2\sum_j(c_j^n)^2+\Bigl(\sum_jc_j^n+d_n\Bigr)^2=2\operatorname{Sp}(B^{1/2}A_nB^{1/2})^2+\bigl(\operatorname{Sp}(B^{1/2}A_nB^{1/2})+d_n\bigr)^2=$$

$$=2\operatorname{Sp}(A_nB)^2+(\operatorname{Sp}A_nB+d_n)^2. \tag{7}$$

Therefore the following assertion is valid:

Corollary. In order that the limit in measure $\mu$ of the expressions $(A_nx,x)+d_n$ exist (for a certain choice of $d_n$) it is necessary and sufficient that the equality

$$\lim_{n,m\to\infty}\operatorname{Sp}([A_n-A_m]B)^2=0$$

be satisfied. Moreover, we can choose $d_n$ equal to $-\operatorname{Sp}A_nB$. If the limit $\lim_{n\to\infty}\operatorname{Sp}A_nB$ exists, then $d_n$ can be chosen equal to zero.
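Formula (7) can be verified exactly in finite dimensions. The sketch below is an illustration only: the matrices `A` and `B` are hypothetical, the measure is taken to be the centered Gaussian distribution with correlation matrix `B`, and the fourth moments are computed with Isserlis' formula $\mathsf{E}x_ix_jx_kx_l=B_{ij}B_{kl}+B_{ik}B_{jl}+B_{il}B_{jk}$, which is a standard fact about Gaussian vectors rather than a step taken from the text.

```python
# Hypothetical matrices (illustrative): B is a positive definite correlation matrix,
# A is a symmetric matrix defining the quadratic form (Ax, x).
B = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]
A = [[1, 2, 0], [2, -1, 3], [0, 3, 2]]
N = 3


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(N)) for j in range(N)] for i in range(N)]


def trace(p):
    return sum(p[i][i] for i in range(N))


def second_moment(a_mat, b_mat, d):
    """E[(Ax, x) + d]^2 for a centered Gaussian x with correlation matrix B."""
    total = 0
    for i in range(N):
        for j in range(N):
            for k in range(N):
                for l in range(N):
                    fourth = (b_mat[i][j] * b_mat[k][l]
                              + b_mat[i][k] * b_mat[j][l]
                              + b_mat[i][l] * b_mat[j][k])   # Isserlis' formula
                    total += a_mat[i][j] * a_mat[k][l] * fourth
    mean_quad = sum(a_mat[i][j] * b_mat[i][j] for i in range(N) for j in range(N))  # E(Ax, x)
    return total + 2 * d * mean_quad + d * d


AB = matmul(A, B)


def formula_7(d):
    """Right-hand side of (7): 2 Sp (AB)^2 + (Sp AB + d)^2."""
    return 2 * trace(matmul(AB, AB)) + (trace(AB) + d) ** 2


d_arbitrary = 5
d_centering = -trace(AB)   # the choice d_n = -Sp A_n B from the corollary
print(second_moment(A, B, d_arbitrary), formula_7(d_arbitrary))
print(second_moment(A, B, d_centering), 2 * trace(matmul(AB, AB)))
```

With the centering choice $d=-\operatorname{Sp}AB$ the second term of (7) vanishes, which is why mean-square convergence of the centered expressions is governed by $\operatorname{Sp}([A_n-A_m]B)^2$ alone.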

We use these arguments to obtain a general form of the quadratic functional. Since

$$(A_nx,x)=\sum_{k,j}a_{kj}^n\sqrt{\lambda_k}\sqrt{\lambda_j}\,(\xi_k\xi_j-\delta_{kj})+\sum_k\lambda_ka_{kk}^n,$$

$$\operatorname{Sp}([A_n-A_m]B)^2=\sum_{k,j}\lambda_k\lambda_j(a_{kj}^n-a_{kj}^m)^2,$$

it follows that if the limit $\lim_{n\to\infty}[(A_nx,x)+d_n]$ in measure exists, then the limits

$$\lim_{n\to\infty}a_{kj}^n\sqrt{\lambda_k}\sqrt{\lambda_j}=\beta_{kj}$$

also exist. Moreover,

$$\sum_{k,j}\beta_{kj}^2<\infty.$$

Functionals which are the limits of expressions

$$\lim_{n\to\infty}[(A_nx,x)-\operatorname{Sp}A_nB]$$

will henceforth be termed centered quadratic functionals. The general form of a centered functional is given by the formula

$$\sum_{k,j}\beta_{kj}\Bigl[\frac{(x,e_k)(x,e_j)}{\sqrt{\lambda_k\lambda_j}}-\delta_{kj}\Bigr], \tag{8}$$

where $\beta_{kj}$ are arbitrary numbers satisfying $\sum_{k,j}\beta_{kj}^2<\infty$. If an arbitrary constant is added to a centered functional, we obtain a general form of a measurable quadratic functional. A centered functional is obtained from an arbitrary one by subtracting the mathematical expectation.



Linear and quadratic functionals of stationary Gaussian processes. Let $\xi(t)$ be a real stationary Gaussian process with mean 0, correlation function $R(t)$ and spectral function $F(\lambda)$:

$$R(t)=\int e^{i\lambda t}\,dF(\lambda).$$

Next let $y(\lambda)$ be a complex-valued Gaussian process with orthogonal increments such that

$$\xi(t)=\int e^{i\lambda t}\,dy(\lambda).$$

We consider $\xi(t)$ on the interval $[-T,T]$. A probability measure on the Hilbert space $\mathcal{L}_2[-T,T]$ of real square-integrable functions on $[-T,T]$ is associated with this process. We now utilize the preceding results for obtaining linear and quadratic functionals on the process $\xi(t)$.

An arbitrary random variable $\eta$ admitting representation as the mean square limit of the variables

$$\eta_n = \int_{-T}^{T} x_n(t)\,\xi(t)\,dt,$$

where $x_n(t)$ is a sequence of continuous functions defined on $[-T, T]$, is called a linear functional on the process $\xi(t)$. We now obtain the general form of a linear functional on $\xi(t)$. Since



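$$\eta_n = \int_{-T}^{T}\int e^{i\lambda t}\,dy(\lambda)\,x_n(t)\,dt = \int\left[\int_{-T}^{T} e^{i\lambda t}x_n(t)\,dt\right]dy(\lambda),$$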



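it follows that

$$\mathsf{E}\,\eta_n^2 = \int |\varphi_n(\lambda)|^2\,dF(\lambda),$$

where

$$\varphi_n(\lambda) = \int_{-T}^{T} e^{i\lambda t}x_n(t)\,dt. \qquad (9)$$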



Denote by $\mathcal{W}_T(F)$ the Hilbert space of functions containing all functions of the form (9) and completed in the scalar product

$$(\varphi_1, \varphi_2) = \int \varphi_1(\lambda)\overline{\varphi_2(\lambda)}\,dF(\lambda).$$

It then follows from the convergence of $\eta_n$ to a certain limit that the functions $\varphi_n(\lambda)$ defined by equation (9) converge to a certain limit $\varphi$ belonging to $\mathcal{W}_T(F)$, and moreover

$$\eta = \int \varphi(\lambda)\,dy(\lambda). \qquad (10)$$

Clearly formula (10) represents, for $\varphi\in\mathcal{W}_T(F)$, the general form of a linear functional on a process $\xi(t)$ defined on $[-T, T]$.
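The isometry $\mathsf{E}\,\eta^2 = \int|\varphi(\lambda)|^2\,dF(\lambda)$ behind formulas (9) and (10) can be illustrated numerically. The sketch below assumes the hypothetical example $R(t) = \cos t$ (so that $F$ has mass $1/2$ at each of $\lambda = \pm 1$) and $x(t) \equiv 1$; it compares the variance $\int\!\!\int R(t-s)\,dt\,ds$ of $\eta = \int_{-T}^{T}\xi(t)\,dt$ with the spectral expression. The function names are illustrative only.

```python
import math

def double_midpoint(func, T, steps):
    """Midpoint-rule approximation of the double integral of func over [-T, T]^2."""
    h = 2.0 * T / steps
    nodes = [-T + (i + 0.5) * h for i in range(steps)]
    return sum(func(t, s) for t in nodes for s in nodes) * h * h

T = 1.3

# Variance of eta = integral of xi(t) over [-T, T], for the assumed R(t) = cos t.
variance = double_midpoint(lambda t, s: math.cos(t - s), T, 400)

def phi(lam):
    """phi(lambda) = integral over [-T, T] of exp(i*lambda*t) dt for x(t) = 1 (real valued)."""
    return 2.0 * math.sin(lam * T) / lam

# Integral of |phi|^2 dF for F with mass 1/2 at lambda = -1 and at lambda = +1.
spectral = 0.5 * phi(-1.0) ** 2 + 0.5 * phi(1.0) ** 2
```

Both quantities equal $4\sin^2 T$ in this example, up to the quadrature error of the midpoint rule.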

An arbitrary random variable $\zeta$ which is the mean-square limit of the variables

$$\zeta_n = \int_{-T}^{T}\int_{-T}^{T} g_n(t, s)\,[\xi(t)\xi(s) - R(t-s)]\,dt\,ds,$$

where $g_n(t, s)$ is a sequence of continuous real symmetric functions defined for all $t$ and $s\in[-T, T]$, is called a centered quadratic functional on $\xi(t)$.
It can be easily calculated that 



E|C 



T T T T 

J^=E J J j g„(t,s)g„(u,v)^(t)^(s)^(u)i{v)dtdsdudv = 



-T -T -T -T 
where



T T T T 

g„{t, s) g„{u, v) R{t — s) R{u — v) dt ds du dv = 

-T ~T -T -T 





\(P„{X, fi)\^ dF{X) dF{^l), 






1 1 

-T -T 



gn{t, s) dt ds. 



(11) 



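Denote by $\mathcal{W}_T^2(F)$ the Hilbert space of functions containing all functions of form (11) and completed in the scalar product

$$(\varphi_1, \varphi_2) = \int\!\!\int \varphi_1(\lambda, \mu)\overline{\varphi_2(\lambda, \mu)}\,dF(\lambda)\,dF(\mu).$$

Then, if the variables $\zeta_n$ converge to a certain limit, $\varphi_n$ converge to a certain function $\varphi$ belonging to $\mathcal{W}_T^2(F)$. To express $\zeta$ in terms of $\varphi$ we introduce the double stochastic integral

$$\int\!\!\int \varphi(\lambda, \mu)\,dy(\lambda)\,dy(\mu). \qquad (12)$$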

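We define this integral as an integral over a random measure with orthogonal values (cf. Chapter IV, Section 4). Let the measure $\nu$ on $R^2$ be defined on the rectangles by the relation

$$\nu([\lambda_1, \lambda_2]\times[\mu_1, \mu_2]) = y([\lambda_1, \lambda_2])\,y([\mu_1, \mu_2]) - F([\lambda_1, \lambda_2]\cap[\mu_1, \mu_2]), \qquad (13)$$

where

$$y([\lambda_1, \lambda_2]) = y(\lambda_2) - y(\lambda_1), \qquad F([\lambda_1, \lambda_2]) = F(\lambda_2) - F(\lambda_1).$$

The measure $\nu$ is a measure with orthogonal values for which

$$\mathsf{E}\,|\nu([\lambda_1, \lambda_2]\times[\mu_1, \mu_2])|^2 = F([\lambda_1, \lambda_2])\,F([\mu_1, \mu_2]).$$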

Therefore the integral

$$\int\!\!\int \varphi(\lambda, \mu)\,dy(\lambda)\,dy(\mu) = \int\!\!\int \varphi(\lambda, \mu)\,\nu(d\lambda\times d\mu)$$

is defined for all $\varphi$ such that

$$\int\!\!\int |\varphi(\lambda, \mu)|^2\,dF(\lambda)\,dF(\mu) < \infty$$

and in particular for $\varphi\in\mathcal{W}_T^2(F)$. We show that the integral (12) with $\varphi\in\mathcal{W}_T^2(F)$ represents the general form of a centered quadratic functional. For this purpose it is sufficient to verify that



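$$\int_{-T}^{T}\int_{-T}^{T} g_n(t, s)\,[\xi(t)\xi(s) - R(t-s)]\,dt\,ds = \int\!\!\int \varphi_n(\lambda, \mu)\,dy(\lambda)\,dy(\mu), \qquad (14)$$

where $\varphi_n$ is connected with $g_n$ by formula (11). Let

$$\hat g(\lambda) = \int_{-T}^{T} e^{i\lambda t}g(t)\,dt \quad\text{and}\quad \hat k(\lambda) = \int_{-T}^{T} e^{i\lambda t}k(t)\,dt.$$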



Then

$$\int_{-T}^{T}\int_{-T}^{T} g(t)k(s)\,[\xi(t)\xi(s) - R(t-s)]\,dt\,ds = \int \hat g(\lambda)\,dy(\lambda)\int \hat k(\mu)\,dy(\mu) - \int \hat g(\lambda)\hat k(\lambda)\,dF(\lambda) = \int\!\!\int \hat g(\lambda)\hat k(\mu)\,dy(\lambda)\,dy(\mu)$$

(the last equality follows from formula (13) for step functions, therefore it is valid for all continuous functions). Hence (14) is also valid for the linear combinations of the form

$$\sum_j g_j(t)k_j(s),$$

and hence for all $g_n(t, s)$. □

As a corollary from the above we obtain the following formula: if 
(p{x, /i)=X c^cPki^) then 

k 



<^(2, fi) dy{X) dy{fi) = 
I cp,(l)dy{X)"-^



\cp,(xr dF{X) 



This formula will be utilized in succeeding chapters. 



(15) 




Chapter VI 



Limit Theorems for Random Processes 



§ 1. Weak Convergences of Measures in Metric Spaces 

Let $\mathcal{X}$ be a metric space with metric $\rho(x, y)$, $\mathfrak{B}$ be the $\sigma$-algebra of its Borel subsets, $C_{\mathcal{X}}$ be the space of all bounded continuous functions defined on $\mathcal{X}$ with the norm $\|f\| = \sup_x|f(x)|$. A sequence of measures $\mu_n$ defined on $\mathfrak{B}$ is called weakly convergent to measure $\mu$ if for any function $f$ in $C_{\mathcal{X}}$ the relation

$$\lim_{n\to\infty}\int f(x)\,\mu_n(dx) = \int f(x)\,\mu(dx)$$

is satisfied. The set $M$ of measures $\{\mu\}$ defined on $\mathfrak{B}$ is called weakly compact if from any sequence of measures in $M$ a weakly convergent subsequence can be extracted.

Theorem 1. Let $\mathcal{X}$ be a complete separable space. In order that the set $M$ of measures defined on $\mathfrak{B}$ be weakly compact it is necessary and sufficient that the following conditions be satisfied:

a) $\sup\{\mu(\mathcal{X});\ \mu\in M\} < \infty$;

b) for every $\varepsilon > 0$ there exists a compactum $K$ such that

$$\sup\{\mu(\mathcal{X}\setminus K);\ \mu\in M\} < \varepsilon.$$
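Condition b) can be illustrated on the real line. The sketch below assumes two hypothetical families of Gaussian measures, one with means confined to $[-1, 1]$ and one with means $1, 2, \ldots, 50$, all with unit variance, and evaluates the supremum of the mass outside the compactum $[-c, c]$. The function names are illustrative and not part of the text.

```python
import math

def normal_cdf(x, mean=0.0, std=1.0):
    """Distribution function of N(mean, std^2)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def mass_outside(c, mean, std=1.0):
    """mu(R \\ [-c, c]) for mu = N(mean, std^2)."""
    return 1.0 - (normal_cdf(c, mean, std) - normal_cdf(-c, mean, std))

def sup_mass_outside(c, means):
    """Supremum over the family of the mass left outside [-c, c]."""
    return max(mass_outside(c, m) for m in means)

bounded_means = [k / 10.0 for k in range(-10, 11)]   # means in [-1, 1]
drifting_means = [float(n) for n in range(1, 51)]    # means 1, 2, ..., 50

tight = sup_mass_outside(6.0, bounded_means)
not_tight = sup_mass_outside(6.0, drifting_means)
```

For the first family a single interval captures almost all the mass of every member, as condition b) requires; for the second family the mass escapes to infinity, and no compactum works once the family is extended to all means $n = 1, 2, \ldots$.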

Proof. Necessity. Since the set $M$ is compact it follows that the set of numbers

$$\left\{\int f(x)\,\mu(dx);\ \mu\in M\right\}$$

is also compact for any continuous bounded function $f$ and hence this set is bounded. Choosing $f = 1$ we obtain the necessity of condition a). We now prove the necessity of condition b). Denote by $K^\delta$ the set of $x$ such that $\rho(x, K) < \delta$, where $\rho(x, K) = \inf_{y\in K}\rho(x, y)$. We show that for any $\varepsilon > 0$ and $\delta > 0$ there exists a compactum $K$ such that $\mu(\mathcal{X}\setminus K^\delta)\le\varepsilon$ for all $\mu\in M$.







Assume the contrary, i.e. that such a compactum does not exist for the given $\varepsilon > 0$ and $\delta > 0$. We consider an arbitrary measure $\mu_1\in M$ and

let be a compactum such that /j,^ < s. Since sup ^ 

a measure jU 2 gM can be found such that Hence a com- 
pact set can be found such that In view of the 

above assumption 

supfi{^\K^/\Kf^) = supm(^\[K('^ u > a 

n n 

Therefore one can find a measure such that e 

and a compactum such that Continuing 

this process we construct a sequence of measures and compacta 
such that Let 

for xeK^^j 2 ZiW = 0 for Since the distance between any 

two sets and X^"”^ is greater than d, = Therefore the 

00 

series gp{x)= ^ Xi{^) is convergent for each xe^ and the function gp{x) 

i = p 

is continuous and bounded by 1. Since a weakly convergent subsequence can be chosen from the sequence $\{\mu_n\}$, we can assume without loss of generality that the sequence $\{\mu_n\}$ is weakly convergent to a certain measure $\mu$. Then

$$\lim_{n\to\infty}\int g_p(x)\,\mu_n(dx) = \int g_p(x)\,\mu(dx).$$

Since

$$\int g_p(x)\,\mu_n(dx) \ge \int \chi_n(x)\,\mu_n(dx) > \varepsilon \quad\text{for } n > p,$$

the inequality $\int g_p(x)\,\mu(dx)\ge\varepsilon$ is satisfied for all $p$. This is, however, impossible since $g_p(x)\to 0$ for all $x$ as $p\to\infty$ and $0\le g_p(x)\le 1$, so that in view of Lebesgue's theorem $\lim_{p\to\infty}\int g_p(x)\,\mu(dx) = 0$. We have thus proved the existence, for each $\varepsilon > 0$ and $\delta > 0$, of a compactum $K$ such that $\mu(\mathcal{X}\setminus K^\delta)\le\varepsilon$ for all $\mu\in M$. Fixing $\varepsilon > 0$ we construct a compactum $K^{(r)}$



such that 



sup 
00

Then iC = O will be a compactum and 

r= 1 






The necessity of condition b) is verified. 

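Sufficiency. Let the conditions of the theorem be satisfied. It follows from condition b) that a sequence of compacta $K^{(n)}$ can be constructed such that

$$K^{(n)}\subset K^{(n+1)}, \qquad \mu(\mathcal{X}\setminus K^{(n)})\le\varepsilon_n, \qquad \mu\Big(\bigcup_n K^{(n)}\Big) = \mu(\mathcal{X})$$

for all $\mu\in M$ and $\varepsilon_n\downarrow 0$.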

Let $F$ be a countable set of functions $f_n\in C_{\mathcal{X}}$ such that for all $m$ the functions $\{f_n\}$ restricted to $K^{(m)}$ are everywhere dense in $C_{K^{(m)}}$. The existence of such a countable set follows from the separability of the spaces $C_{K^{(m)}}$ and the feasibility of extending any function in $C_{K^{(m)}}$ to a function in $C_{\mathcal{X}}$. Let $\mu_n$ be an arbitrary sequence of measures in $M$. We choose a subsequence $\mu_{n_k}$ such that for all $f\in F$ the limit

$$\lim_{k\to\infty}\int f(x)\,\mu_{n_k}(dx) = L(f)$$

exists. We now show that this limit exists for all $\varphi\in C_{\mathcal{X}}$. Indeed, for any $\varepsilon > 0$, $\delta > 0$ one can find a function $f\in F$ such that
sup{l/(x)-(/)(x)|; ju{^\K^)^e and ||/K1. Therefore 



sup/x(^) + 28, 






and hence 



lim 

k-*^ 00 , 



k-> 00 J 






^48 + 2^ sup 



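Since $\varepsilon > 0$ and $\delta > 0$ are arbitrary, it follows that

$$\overline{\lim_{k\to\infty}}\int \varphi(x)\,\mu_{n_k}(dx) = \underline{\lim_{k\to\infty}}\int \varphi(x)\,\mu_{n_k}(dx) = \lim_{k\to\infty}\int \varphi(x)\,\mu_{n_k}(dx).$$

Hence for all $\varphi\in C_{\mathcal{X}}$ the limit $\lim_{k\to\infty}\int \varphi(x)\,\mu_{n_k}(dx)$ exists.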
We shall denote this limit by $L(\varphi)$. Clearly $L(\varphi)$ satisfies conditions 1) and 2) of Theorem 1, Section 2, Chapter V. Next, if $\varphi(x) = 0$ for $x\in K^{(n)}$, then



\L{cp)\ = 



lim 

k~* 00 










Therefore condition 3) of this theorem is also satisfied. Hence the measure $\mu$ exists such that $L(\varphi) = \int \varphi(x)\,\mu(dx)$. The sufficiency of conditions of the theorem is thus established. □



Remark. The completeness of the space $\mathcal{X}$ was not utilized in the course of the proof of the sufficiency part. (It was also not utilized in the proof of the sufficiency part of Theorem 1, Section 2, Chapter V.)

Corollary. If $\mathcal{X}$ is a complete separable metric space and the sequence of measures $\mu_n$ is such that for all $\varphi\in C_{\mathcal{X}}$ the limit

$$L(\varphi) = \lim_{n\to\infty}\int \varphi(x)\,\mu_n(dx)$$

exists, then there exists a measure $\mu$ such that

$$L(\varphi) = \int \varphi(x)\,\mu(dx),$$

i.e. the sequence of measures is necessarily weakly convergent.

Proof. We first show that the set $\{\mu_n\}$ is weakly compact. Condition a) of Theorem 1 is satisfied for this set. Assume that condition b) is not satisfied. In the same manner as in the proof of the necessity of condition b) of Theorem 1, we can construct for some $\varepsilon > 0$ a subsequence $\mu_{n_k}$ and compacta $K^{(k)}$ located at a distance at least $\delta$ from each other and such that $\mu_{n_k}(K^{(k)})\ge\varepsilon$. Let the functions $\chi_i(x)$ be defined in the same manner as in the proof of Theorem 1. Define for each prime $p$ the function

$$\psi_p(x) = \sum_{m=1}^{\infty}\chi_{p^m}(x).$$

Functions $\psi_p(x)$ satisfy relations $0\le\psi_p(x)\le 1$ and $\psi_p(x)\psi_{p'}(x) = 0$ for $p\ne p'$; $\psi_p\in C_{\mathcal{X}}$ and hence the limit

$$L(\psi_p) = \lim_{k\to\infty}\int \psi_p(x)\,\mu_{n_k}(dx)$$

exists.
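Note that for $k = p^m$

$$\int \psi_p(x)\,\mu_{n_k}(dx) \ge \int \chi_k(x)\,\mu_{n_k}(dx) \ge \varepsilon.$$

Therefore

$$L(\psi_p) = \lim_{m\to\infty}\int \psi_p(x)\,\mu_{n_{p^m}}(dx) \ge \varepsilon.$$

Hence for any $N$

$$L(1) \ge L\Big(\sum_{i=1}^{N}\psi_{p_i}\Big) = \sum_{i=1}^{N}L(\psi_{p_i}) \ge N\varepsilon$$

(here $p_1, \ldots, p_N$ are distinct primes), since $L(\varphi)\le L(f)$ for $\varphi\le f$. This contradicts the fact that $L(1)$ is finite. Hence condition b) of the theorem is fulfilled. Let $\mu_{n_k}$ be a weakly convergent subsequence and let $\mu$ be its limit. Then

$$L(\varphi) = \lim_{k\to\infty}\int \varphi(x)\,\mu_{n_k}(dx) = \int \varphi(x)\,\mu(dx).$$

The assertion of the corollary is thus established.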

We now consider the relation between the weak convergence of 
measures and the convergence of the values of measures on individual 
sets. 

Definition. Let $\mu$ be a finite measure on $\mathfrak{B}$. The set $A$ is called the set of continuity of the measure $\mu$ if $\mu(A') = 0$, where $A'$ is the boundary of the set $A$. Hereafter, we shall use the notation $[A]$ to denote the closure of $A$ and $\mathrm{Int}\,A$ for the set of interior points of $A$.

Theorem 2. In order that the sequence of measures $\mu_n$ converge weakly to measure $\mu$ it is necessary and sufficient that for each set $A$ which is the set of continuity of measure $\mu$ the relation $\mu_n(A)\to\mu(A)$ as $n\to\infty$ be satisfied.
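The role of the boundary condition can be seen on a simple example. The sketch below assumes the hypothetical sequence of unit point masses $\mu_n$ at $1/n$, which converges weakly to the unit mass $\mu$ at $0$, and evaluates these measures on two intervals: $(0, 1]$, whose boundary carries $\mu$-mass, and $(-1, 1/2)$, which is a set of continuity of $\mu$. The helper names are illustrative only.

```python
def point_mass_of_interval(point, left, right, include_left, include_right):
    """Value of the unit point mass at `point` on an interval with the given endpoints."""
    above = point > left or (include_left and point == left)
    below = point < right or (include_right and point == right)
    return 1.0 if (above and below) else 0.0

def mu_n(n, *interval):
    """Unit mass at 1/n evaluated on the interval."""
    return point_mass_of_interval(1.0 / n, *interval)

def mu(*interval):
    """Limit measure: unit mass at 0 evaluated on the interval."""
    return point_mass_of_interval(0.0, *interval)

half_open = (0.0, 1.0, False, True)     # A = (0, 1]; boundary {0, 1}, mu({0}) = 1
continuity = (-1.0, 0.5, False, False)  # A = (-1, 1/2); boundary {-1, 1/2} is mu-null

values_half_open = [mu_n(n, *half_open) for n in range(1, 200)]
values_continuity = [mu_n(n, *continuity) for n in range(3, 200)]
```

On $(0, 1]$ the values $\mu_n(A) = 1$ do not approach $\mu(A) = 0$, which is consistent with the theorem because this set is not a set of continuity of $\mu$; on $(-1, 1/2)$ the values agree with $\mu(A) = 1$ for all $n \ge 3$.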

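Proof. We first establish the necessity of the condition stated in the theorem. Let $\mu_n$ converge weakly to $\mu$ and let $A$ be an arbitrary set in $\mathfrak{B}$. We set

$$g_m(x) = \exp\{-m\rho(x, A)\},$$

where $\rho(x, A)$ is the distance from the point $x$ to the set $A$. Since $g_m(x) = 1$ for $x\in[A]$ and $g_m(x)\to 0$ for $x\notin[A]$, we have

$$\lim_{m\to\infty}\int g_m(x)\,\mu(dx) = \mu([A]),$$

and hence for every $\varepsilon > 0$ an $m$ can be found such that

$$\int g_m(x)\,\mu(dx) < \mu([A]) + \varepsilon.$$

Therefore

$$\mu([A]) \ge \int g_m(x)\,\mu(dx) - \varepsilon = \lim_{n\to\infty}\int g_m(x)\,\mu_n(dx) - \varepsilon \ge \overline{\lim_{n\to\infty}}\,\mu_n(A) - \varepsilon,$$

and hence, since $\varepsilon > 0$ is arbitrary,

$$\overline{\lim_{n\to\infty}}\,\mu_n(A) \le \mu([A]).$$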



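Thus

$$\underline{\lim_{n\to\infty}}\,\mu_n(A) = \lim_{n\to\infty}\mu_n(\mathcal{X}) - \overline{\lim_{n\to\infty}}\,\mu_n(\mathcal{X}\setminus A) \ge \mu(\mathcal{X}) - \mu([\mathcal{X}\setminus A]).$$

Taking into account that $\mu([\mathcal{X}\setminus A]) = \mu(\mathcal{X}) - \mu(\mathrm{Int}\,A)$, we obtain

$$\mu(\mathrm{Int}\,A) \le \underline{\lim_{n\to\infty}}\,\mu_n(A) \le \overline{\lim_{n\to\infty}}\,\mu_n(A) \le \mu([A]). \qquad (1)$$

Since $\mu(\mathrm{Int}\,A) = \mu([A])$ for the set of continuity of measure $\mu$, the necessity of the condition of the theorem follows from (1).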

To prove the sufficiency we take an arbitrary function $f$ in $C_{\mathcal{X}}$. The boundary of the set $\{x: a\le f(x) < b\}$ belongs to the set $\{x: f(x) = a\}\cup\{x: f(x) = b\}$. The sets $A_c = \{x: f(x) = c\}$ are disjoint for different values of $c$. Therefore there exists at most a countable number of $c$ satisfying $\mu(A_c) > 0$. We choose a sequence of numbers $a_k$, $k = 1, \ldots, N$, such that $\mu(A_{a_k}) = 0$, $a_k < a_{k+1} < a_k + \varepsilon$, $a_1 < -\|f\|$, $a_N > \|f\|$. Denote by $E_k$ the set $\{x: a_k\le f(x) < a_{k+1}\}$. The sets $E_k$ are the sets of continuity of measure $\mu$. Therefore $\mu_n(E_k)\to\mu(E_k)$. Hence

$$\overline{\lim_{n\to\infty}}\left|\int f(x)\,\mu_n(dx) - \int f(x)\,\mu(dx)\right| \le \overline{\lim_{n\to\infty}}\left[\left|\int f(x)\,\mu_n(dx) - \sum_{k=1}^{N-1}a_k\mu_n(E_k)\right| + \left|\int f(x)\,\mu(dx) - \sum_{k=1}^{N-1}a_k\mu(E_k)\right|\right] \le 2\varepsilon\sum_{k=1}^{N-1}\mu(E_k) = 2\varepsilon\mu(\mathcal{X}).$$

Since $\varepsilon > 0$ is arbitrary we obtain the proof of the sufficiency condition. The theorem is thus proved. □

We now present theorems dealing with the weak convergence of 
measures which utilize the condition of weak compactness. In all these 
theorems the following single fact is utilized : a weakly compact sequence 
which possesses a unique limit point is weakly convergent. 

We say that a sequence $f_n\in C_{\mathcal{X}}$ is weakly convergent to $f$ if the functions $f_n$ are jointly bounded and $f_n(x)$ converges to $f(x)$ for all $x\in\mathcal{X}$. Using this notion we naturally define a weakly closed set of functions and a weak closure of a set of functions.

Theorem 3. In order that a sequence $\mu_n$ be weakly convergent to a certain measure $\mu$ it is necessary and sufficient that it be weakly compact and that for some set of functions $F_0$ whose weak closure coincides with $C_{\mathcal{X}}$, for all $f\in F_0$ the following relation be satisfied:

$$\lim_{n\to\infty}\int f(x)\,\mu_n(dx) = \int f(x)\,\mu(dx).$$



Proof. We show that all weakly convergent subsequences of the sequence $\mu_n$ converge to $\mu$. Indeed, if for all $f\in C_{\mathcal{X}}$

$$\lim_{k\to\infty}\int f(x)\,\mu_{n_k}(dx) = \int f(x)\,\bar\mu(dx),$$

then for $f\in F_0$ the equality

$$\int f(x)\,\bar\mu(dx) = \int f(x)\,\mu(dx)$$

is satisfied. However it follows from the Lebesgue theorem on bounded convergence that the set of $f\in C_{\mathcal{X}}$ such that

$$\int f(x)\,\bar\mu(dx) = \int f(x)\,\mu(dx)$$

is weakly closed. Hence this relation is valid for all $f$ belonging to the weak closure of $F_0$, i.e. for all $f\in C_{\mathcal{X}}$. The sufficiency of the conditions of the theorem is thus established while the necessity is obvious. □

In the case when measures correspond to random processes, it is 
convenient to apply theorems in which the convergence of marginal 
distributions is postulated. We now prove a general theorem which 
enables us to deduce corollaries of this kind. 



Theorem 4. Let $\mu_n$ be a weakly compact sequence, $\mu$ be a certain measure, $\mathfrak{A}_0$ be a class of open sets closed under finite unions and intersections and satisfying conditions:

1) the $\sigma$-closure of $\mathfrak{A}_0$ contains all the open sets,

2) all the sets belonging to $\mathfrak{A}_0$ are sets of continuity of measure $\mu$.

If $\lim_{n\to\infty}\mu_n(A) = \mu(A)$ for all $A\in\mathfrak{A}_0$, then $\mu_n$ is weakly convergent to $\mu$.

Proof. Let $\mu_{n_k}$ be a sequence weakly convergent to the measure $\bar\mu$. It follows from (1) that for all $A\in\mathfrak{A}_0$ the relations

$$\bar\mu(A) = \bar\mu(\mathrm{Int}\,A) \le \lim_{k\to\infty}\mu_{n_k}(A) = \mu(A)$$

are satisfied. Therefore for all $A\in\mathfrak{A}_0$ the inequality $\bar\mu(A)\le\mu(A)$ is valid. Clearly this relation is fulfilled for a monotone class of sets and this class contains all the sets belonging to $\mathfrak{A}_0$. Therefore this inequality is satisfied for every open set as well. And since each closed set is an intersection of a decreasing sequence of open sets, we have $\bar\mu(F)\le\mu(F)$ also for every closed set $F$. Therefore for all sets $A\in\mathfrak{A}_0$ we have $\bar\mu(A')\le\mu(A') = 0$. But then $\bar\mu(A) = \lim_{k\to\infty}\mu_{n_k}(A) = \mu(A)$. Since the measures $\mu$ and $\bar\mu$ coincide on $\mathfrak{A}_0$ they also coincide on $\mathfrak{B}$. Thus all the limit points of the sequence $\mu_n$ coincide with $\mu$. The theorem is thus proved. □

Remark. In the case of measures on different functional spaces the class $\mathfrak{A}_0$ is usually chosen to be the class of all open cylindrical sets of continuity of measure $\mu$.

Given the weak convergence of measures we may establish the convergence of integrals for some discontinuous functions. Here we are utilizing the fact that the set of points of discontinuity of a $\mathfrak{B}$-measurable function is also a $\mathfrak{B}$-measurable set.

Lemma. If $\mu_n$ is weakly convergent to $\mu$, then

$$\lim_{n\to\infty}\int f(x)\,\mu_n(dx) = \int f(x)\,\mu(dx)$$

for every $\mathfrak{B}$-measurable, $\mu$-almost everywhere continuous and bounded function $f(x)$.

Proof. Let $A$ be the set of points of discontinuity of function $f(x)$. Set $E_\alpha = \{x: f(x) < \alpha\}$ and let $E'_\alpha$ be the boundary of the set $E_\alpha$. For $\alpha < \beta$ the set $E'_\alpha\cap E'_\beta$ is contained in the intersection of the sets $[E_\alpha]$ and $[\mathcal{X}\setminus E_\beta]$; therefore for $x\in E'_\alpha\cap E'_\beta$ the inequalities $\liminf_{y\to x} f(y)\le\alpha$, $\limsup_{y\to x} f(y)\ge\beta$ are satisfied. Thus $E'_\alpha\cap E'_\beta\subset A$ and the sets $E'_\alpha\setminus A$ with different $\alpha$ are disjoint. Therefore at most a countable number of $\alpha$ exists such that $\mu(E'_\alpha) = \mu(E'_\alpha\setminus A) > 0$, i.e. all the sets $E_\alpha$ except possibly a countable number of them are sets of continuity of measure $\mu$. Therefore for all $\alpha$ except possibly a countable number of them

$$\lim_{n\to\infty}\mu_n(\{x: f(x) < \alpha\}) = \mu(\{x: f(x) < \alpha\}). \qquad (2)$$



We obtain from the formula of change of variables in integrals that

$$\int f(x)\,\mu_n(dx) = \int \alpha\,d_\alpha\mu_n(\{x: f(x) < \alpha\}), \qquad (3)$$

$$\int f(x)\,\mu(dx) = \int \alpha\,d_\alpha\mu(\{x: f(x) < \alpha\}). \qquad (4)$$

The assertion of the lemma now follows from equalities (2)-(4). □

Consider finally conditions of weak convergence of measures in linear normed spaces. Let $\mathcal{X}$ be a separable Banach space, $L$ be a linear set of linear functionals on $\mathcal{X}$ such that the minimal $\sigma$-algebra with respect to which all the functionals $l$ in $L$ are measurable coincides with the $\sigma$-algebra $\mathfrak{B}$ of all Borel sets of the space $\mathcal{X}$. Denote by $\chi_n(l)$ and $\chi(l)$ the characteristic functionals of measures $\mu_n$ and $\mu$ respectively:

$$\chi_n(l) = \int e^{il(x)}\,\mu_n(dx), \qquad \chi(l) = \int e^{il(x)}\,\mu(dx).$$

Theorem 5. In order that a sequence of measures $\mu_n$ converge weakly to measure $\mu$ it is necessary and sufficient that the sequence $\mu_n$ be weakly compact and that for all $l\in L$ the equality

$$\lim_{n\to\infty}\chi_n(l) = \chi(l) \qquad (5)$$

be satisfied.

Proof. The necessity of conditions of the theorem is obvious. To prove sufficiency we can show, as before, that each limit point of the sequence $\mu_n$ coincides with $\mu$. Let $\bar\mu$ be such a limit point; then for all $l\in L$

$$\chi(l) = \bar\chi(l) = \int e^{il(x)}\,\bar\mu(dx).$$

The equality $\bar\mu = \mu$ thus follows from the fact that the characteristic functionals coincide. □



§ 2. Conditions for Weak Convergence of Measures in Hilbert Spaces

In this section $\mathcal{X}$ denotes a separable Hilbert space and $\mathfrak{B}$ denotes the $\sigma$-algebra of Borel sets in $\mathcal{X}$. Measures on $\mathfrak{B}$ are considered and the conditions for weak compactness and weak convergence of these measures are studied. As is seen from the results of the preceding section, the basic difficulty is to obtain conditions for weak compactness of the family of measures. It turns out that in the case of a Hilbert space one can state necessary and sufficient conditions for the weak compactness of a family of measures in terms of characteristic functionals.

Hereafter the following notation will be used for certain families of linear operators in $\mathcal{X}$: $T_c$ will denote the set of all symmetric nonnegative completely continuous operators, $S$ the set of all kernel operators and $S_a$ the subset of $S$ consisting of operators with the trace at most $a$. First we prove a convenient criterion of compactness of a set in $\mathcal{X}$.

Lemma 1. For any $A\in T_c$ the set $\{x: |A^{-1}x|\le 1\}$ is compact. For each compactum $K$ one can find an operator $A\in T_c$ such that $K\subset\{x: |A^{-1}x|\le 1\}$.

Proof. The set $\{x: |A^{-1}x|\le 1\}$ is the image of the closed unit ball under transformation $A$ and is compact in view of the complete continuity of operator $A$. Let $K$ be a compact set and $\{e_k\}$ an arbitrary orthonormal basis in $\mathcal{X}$. Set $x^k = (x, e_k)$. It follows from the compactness of $K$ that a sequence $c_n\downarrow 0$ exists such that $\sum_{k\ge n}(x^k)^2\le c_n$ for all $x\in K$. We now choose numbers $d_n\downarrow 0$ such that the relations $\sum_n d_n = +\infty$ and $\sum_n d_nc_n < \infty$ will be satisfied. Then

$$\sum_{k=1}^{\infty}\sum_{j=1}^{k} d_j(x^k)^2 = \sum_{k=1}^{\infty} d_k\sum_{j\ge k}(x^j)^2 \le \sum_{k=1}^{\infty} d_kc_k < \infty.$$

We define an operator $A$ for which $e_k$ are eigenvectors and

$$Ae_k = \left(\frac{\sum_{j=1}^{\infty} d_jc_j}{\sum_{i=1}^{k} d_i}\right)^{1/2} e_k.$$

Then

$$|A^{-1}x|^2 = \sum_{k=1}^{\infty}(x^k)^2\,\frac{\sum_{j=1}^{k} d_j}{\sum_{j=1}^{\infty} d_jc_j} \le 1$$

for all $x\in K$. The fact that $A$ is a completely continuous operator follows from condition

$$\lim_{k\to\infty}\frac{\sum_{j=1}^{\infty} d_jc_j}{\sum_{i=1}^{k} d_i} = 0.$$

The lemma is thus proved. □
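The construction in this proof can be tried out numerically on a truncated example. The sketch below assumes the hypothetical compactum $K = \{x: \sum_k k^2(x^k)^2\le 1\}$, for which the tails satisfy $\sum_{k\ge n}(x^k)^2\le c_n = 1/n^2$, and the choice $d_n = 1/n$ (so that $\sum d_n$ diverges and $\sum d_nc_n$ converges). It builds the eigenvalues $(\sum_j d_jc_j/\sum_{i\le k} d_i)^{1/2}$ and checks $|A^{-1}x|^2\le 1$ on random points of $K$. The truncation level and all names are assumptions made for illustration.

```python
import math
import random

N = 200  # truncation level: only the first N coordinates are kept (assumption)

# Tail bounds c_n = 1/n^2 for K = {x : sum_k k^2 (x^k)^2 <= 1}, and weights d_n = 1/n.
c = [1.0 / (n * n) for n in range(1, N + 1)]
d = [1.0 / n for n in range(1, N + 1)]
total = sum(dn * cn for dn, cn in zip(d, c))  # truncated sum of d_j c_j

partial = []
running = 0.0
for dn in d:
    running += dn
    partial.append(running)                   # partial[k-1] = d_1 + ... + d_k

# Eigenvalues of A: A e_k = eigenvalues[k-1] * e_k
eigenvalues = [math.sqrt(total / p) for p in partial]

def random_point_of_K(rng):
    """Random x with sum_k k^2 (x^k)^2 = radius^2 <= 1."""
    y = [rng.gauss(0.0, 1.0) for _ in range(N)]
    norm = math.sqrt(sum(v * v for v in y))
    radius = rng.random()
    return [radius * v / (norm * (k + 1)) for k, v in enumerate(y)]

def inverse_norm_squared(x):
    """|A^{-1} x|^2 in the eigenbasis of A."""
    return sum((xk / a) ** 2 for xk, a in zip(x, eigenvalues))

rng = random.Random(0)
worst = max(inverse_norm_squared(random_point_of_K(rng)) for _ in range(500))
```

The eigenvalues decrease to zero only at the rate of the inverse square root of the harmonic sums, which reflects how slowly the divergent series $\sum d_n$ is allowed to grow while $\sum d_nc_n$ stays finite.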



Theorem 1. Let $M$ be a family of measures and let $\chi_\mu(z)$, $z\in\mathcal{X}$, be the characteristic functional of the measure $\mu\in M$. For weak compactness of the set $M$ it is necessary and sufficient that: a) $\chi_\mu(0)$ be bounded for $\mu\in M$; b) for any $\varepsilon > 0$ one can find an operator $B$, $B\in T_c$, and for each $\mu\in M$ one can find an operator $A_\mu\in S_1$, such that $\mathrm{Re}[\chi_\mu(0) - \chi_\mu(z)]\le\varepsilon$ provided only $(BA_\mu Bz, z)\le 1$.

Proof. The necessity of condition a) follows from the fact that for a weakly compact set $M$ condition a) of Theorem 1 in Section 1 is satisfied. We now establish the necessity of condition b). We can assume without loss of generality that $\mu(\mathcal{X})\le 1$. In view of Theorem 1, Section 1, one can find for a weakly compact set $M$ a compactum $K\subset\mathcal{X}$ such that for all $\mu\in M$ the inequality $\mu(\mathcal{X} - K) < \varepsilon/2$ is satisfied. Then



Re[z^(0)-x^(z)] = 



J 



8 1 

[1 — cos(x, z)] 

K 



(z, xY (i{dx). 



(1) 



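Let $B$ be an operator in $T_c$ such that $K\subset\{x: |B^{-1}x|\le\sqrt{\varepsilon}\}$. Such an operator exists in view of Lemma 1. Let $A_\mu$ be a nonnegative symmetric operator for which

$$(BA_\mu Bz, z) = \frac{1}{\varepsilon}\int_K (x, z)^2\,\mu(dx). \qquad (2)$$

Then

$$(A_\mu z, z) = \frac{1}{\varepsilon}\int_K (x, B^{-1}z)^2\,\mu(dx) = \frac{1}{\varepsilon}\int_K (B^{-1}x, z)^2\,\mu(dx)$$

and

$$\mathrm{Sp}\,A_\mu = \sum_k (A_\mu e_k, e_k) = \frac{1}{\varepsilon}\int_K |B^{-1}x|^2\,\mu(dx) \le 1,$$

i.e. $A_\mu\in S_1$. Inequality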



( 3 ) 
follows from relations (1) and (2). The necessity of condition b) is thus proved.

We now prove the sufficiency of the theorem's conditions. The boundedness of $\mu(\mathcal{X})$ follows from condition a). It follows from Theorem 1 of Section 1 that it is sufficient to show the existence, for each $\varepsilon > 0$, of a compactum $K$ such that $\mu(\mathcal{X} - K)\le\varepsilon$ for all $\mu\in M$. Let $B$ be an operator such that for all $\mu\in M$ the inequality $\mathrm{Re}[\chi_\mu(0) - \chi_\mu(z)]\le\varepsilon/2$ is satisfied for $(BA_\mu Bz, z)\le 1$, where $A_\mu\in S_1$. Then

$$\mathrm{Re}[\chi_\mu(0) - \chi_\mu(z)] \le \frac{\varepsilon}{2} + 2(BA_\mu Bz, z). \qquad (4)$$



We now bound the integral 



1-exp^ --{B-^x, x) 



lii(dx) 



where >^>0. Let ^ 2 ? ••• a complete orthonormal sequence of eigen- 
vectors of operator B and let be the eigenvalue corresponding to ej,. 
Then 



00 (yk\2 

k=l Pk 



where x^ = (x, e^). Note that 



A " {x^f 

^fc=l Pk 



= {2nX) 






exp\i X t, Pi{z'‘f\dzK..dz", 

k=l ^^k=l 



where z\..., z” are real variables. Therefore 



I 



[l-exp^--(B ^x,x) 



p{dx)-~ 






i X x^z^' 

1 - ^“=1 



1 " 



X exp| dz\..dfn(dx) = 
xexpj-;^ Z dzK..dz’‘^]im(2nX) fl ft x

X J. • • ^ + 2 ^BA^B 2*6(0 ^z x: 

xexpj-^ Z PU^'‘f^dz\..dz" = 



=%2i lim Z (^^6*, eJ=%2/lSp/l^<^+2A. 

Z n-^oo J Z Z 



Hence, 

l-exp| I /t(</x)< 

(B “ ^x, x)>C 

1 — exp|— x)| in{dx)^--\-2X. 

(B-2jc, x)>c 

£ + 4^ 

We choose A>0 and C>0 in such a manner that ^<8. 

2-2e,p{-f} 

Then for all $\mu\in M$ the inequality $\mu(\{x: |B^{-1}x| > C\}) < \varepsilon$ is satisfied and hence $K = \{x: |B^{-1}x|\le C\}$ is the required compactum. The theorem is thus proved. □

Remark 1. A simple example due to Yu. V. Prokhorov and V. V. Sazonov shows that for a weakly compact family of measures one cannot always find, for each $\varepsilon > 0$, an operator $T\in S$ (the same for all measures $\mu\in M$) such that $\mathrm{Re}(\chi_\mu(0) - \chi_\mu(z))\le\varepsilon$ for $(Tz, z)\le 1$. Let $K$ be a compactum of the form $K = \{x: |B^{-1}x|\le 1\}$, where $B\in T_c$ and $\mathrm{Sp}\,B^2 = +\infty$. We define the measures $\mu_x$ by the equalities $\mu_x(\{x\}) = \mu_x(\{-x\}) = \frac{1}{2}$ and $\mu_x(\mathcal{X}\setminus\{x\}\setminus\{-x\}) = 0$ (here $\{x\}$ is the singleton containing $x$). Consider the family of measures $M = \{\mu_x;\ x\in K\}$. Since $\mu(\mathcal{X}\setminus K) = 0$ for all $\mu\in M$, it follows that $M$ is a weakly compact set. Next we have

$$\mathrm{Re}(\chi_{\mu_x}(0) - \chi_{\mu_x}(z)) = 1 - \cos(x, z) = 2\sin^2\frac{(x, z)}{2},$$

$$\sup_{\mu\in M}\mathrm{Re}(\chi_\mu(0) - \chi_\mu(z)) = \sup_{|B^{-1}x|\le 1} 2\sin^2\frac{(B^{-1}x, Bz)}{2} = \sup_{|y|\le 1} 2\sin^2\frac{(y, Bz)}{2} = 2\sin^2\frac{|Bz|}{2}, \qquad |Bz| < \pi.$$

Let $A$ be an operator such that $\mathrm{Re}(\chi_\mu(0) - \chi_\mu(z))\le 1$ for $(Az, z)\le 1$. Then $|Bz| < \pi$ for $(Az, z)\le 1$. Hence $|BA^{-1/2}z| < \pi$ for $|z|\le 1$, i.e. $BA^{-1/2}$ is a bounded operator. Therefore $B = CA^{1/2}$, where $C$ is a bounded operator. Since $B = B^* = A^{1/2}C^*$, then $B^2 = A^{1/2}C^*CA^{1/2}$ and for an orthonormal sequence $\{e_k\}$ of eigenvectors of operator $A$ we obtain

$$\sum_k (B^2e_k, e_k) = \sum_k (C^*CA^{1/2}e_k, A^{1/2}e_k) \le \|C^*C\|\sum_k (Ae_k, e_k).$$

Hence we necessarily have $\mathrm{Sp}\,A = +\infty$.

Condition b) of the theorem appears to be somewhat cumbersome. We therefore present certain modifications of this condition.

Lemma 2. In order that a family of operators $C_\mu$ belonging to $S$ admit representation in the form $C_\mu = BA_\mu B$, where $B\in T_c$, $A_\mu\in S_1$, it is necessary that in each orthonormal basis $\{e_k\}$ the series

$$\sum_{k=1}^{\infty}(C_\mu e_k, e_k) \qquad (5)$$

be convergent uniformly in $\mu$, and it is sufficient that the series be uniformly convergent in at least one basis.

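Proof. Sufficiency. Let $\{e_k\}$ be the orthonormal basis in which series (5) converges uniformly. Set

$$\varepsilon_n = \sup_\mu\sum_{k=n}^{\infty}(C_\mu e_k, e_k)$$

and choose a sequence $a_n > 0$ such that $\sum a_n = +\infty$ and $\sum a_n\varepsilon_n < \infty$. Then

$$\sum_{n=1}^{\infty}\sum_{k=1}^{n} a_k(C_\mu e_n, e_n) = \sum_{k=1}^{\infty} a_k\sum_{n=k}^{\infty}(C_\mu e_n, e_n) \le \sum_{k=1}^{\infty} a_k\varepsilon_k.$$

Next let $B$ be a symmetric operator such that $Be_k = \lambda_ke_k$, where

$$\lambda_k = \left(\sum_{n=1}^{\infty} a_n\varepsilon_n\Big/\sum_{n=1}^{k} a_n\right)^{1/2};$$

since $\lambda_k\to 0$, $B\in T_c$. Set $A_\mu = B^{-1}C_\mu B^{-1}$. Then

$$\sum_{k=1}^{\infty}(A_\mu e_k, e_k) = \sum_{k=1}^{\infty}\frac{1}{\lambda_k^2}(C_\mu e_k, e_k) = \sum_{k=1}^{\infty}\sum_{n=1}^{k} a_n(C_\mu e_k, e_k)\Big/\sum_{n=1}^{\infty} a_n\varepsilon_n \le 1.$$

The sufficiency of the lemma's condition is thus shown.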
Necessity. Let $C_\mu = BA_\mu B$, $B\in T_c$, $A_\mu\in S_1$, and let $\{f_k\}$ be any orthonormal basis. Denote by $P_N$ the projector on the linear subspace spanned by the vectors $f_N, f_{N+1}, \ldots$, and let $B_N = BP_N$. Then

$$\sum_{k=N}^{\infty}(BA_\mu Bf_k, f_k) = \sum_{k=1}^{\infty}(P_NBA_\mu BP_Nf_k, f_k) = \mathrm{Sp}\,(B_N^*A_\mu B_N) = \mathrm{Sp}\,(B_NB_N^*A_\mu) \le \|B_NB_N^*\|\,\mathrm{Sp}\,A_\mu \le \|B_N\|^2,$$

since $\mathrm{Sp}\,AB = \mathrm{Sp}\,B^*A^*$ and $\mathrm{Sp}\,AB\le\mathrm{Sp}\,B\,\|A\|$ if $B\in S$.

To complete the proof we observe that $\|B_N\|\to 0$ as $N\to\infty$. Indeed, since the set $K$ of vectors of the form $Bx$ for $|x|\le 1$ is compact and the functions $|P_Ny|$ are continuous and monotonically decreasing to zero, it follows that $\sup\{|P_Ny|,\ y\in K\}\to 0$ also as $N\to\infty$. But $\sup\{|P_Ny|,\ y\in K\} = \sup_{|x|\le 1}|P_NBx| = \|P_NB\| = \|B_N\|$. The lemma is thus proved. □

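Denote by $S^*$ the set of operators $D$ such that the sum $\sum_k |(De_k, e_k)|$ is finite and bounded for all orthonormal bases $\{e_k\}$. Denote by $\mathrm{Sp}\,|D|$ the supremum of this sum. Denote by $S^*_\varepsilon$ the set of all operators in $S^*$ such that $\mathrm{Sp}\,|D|\le\varepsilon$.

Corollary. Let the family of operators $C_\mu$ admit representation for any $\varepsilon > 0$ in the form $C_\mu = B^{(\varepsilon)}A_\mu^{(\varepsilon)}B^{(\varepsilon)} + D_\mu^{(\varepsilon)}$, where $B^{(\varepsilon)}\in T_c$, $A_\mu^{(\varepsilon)}\in S_1$, $D_\mu^{(\varepsilon)}\in S^*_\varepsilon$. Then an operator $B\in T_c$ exists such that $C_\mu = BA'_\mu B$ and $A'_\mu\in S_1$.

Indeed, it follows from Lemma 2 that it is sufficient to show that for some orthonormal basis $\{e_k\}$ the series $\sum_k (C_\mu e_k, e_k)$ converges uniformly in $\mu$. But for every $\varepsilon > 0$

$$\sum_{k\ge N}(C_\mu e_k, e_k) \le \sum_{k\ge N}(B^{(\varepsilon)}A_\mu^{(\varepsilon)}B^{(\varepsilon)}e_k, e_k) + \varepsilon,$$

and, as follows from Lemma 2, by choosing $N$ sufficiently large the sum

$$\sum_{k\ge N}(B^{(\varepsilon)}A_\mu^{(\varepsilon)}B^{(\varepsilon)}e_k, e_k)$$

becomes less than $\varepsilon$ simultaneously for all $\mu$.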

Denote by $\mathcal{H}_{\mathcal{X}}$ the Hilbert space of linear Hilbert-Schmidt operators defined on $\mathcal{X}$ (i.e. operators $C$ such that $\mathrm{Sp}\,CC^* < \infty$) with the scalar product

$$(A, B) = \mathrm{Sp}\,AB^*.$$

Lemma 3. 1) If $B\in T_c$, $A_\mu\in S_1$ and $B_\mu = BA_\mu B$, then the set of operators $B_\mu^{1/2}$ is compact in $\mathcal{H}_{\mathcal{X}}$.

2) For any compact set in $\mathcal{H}_{\mathcal{X}}$ of operators $C_\mu$ one can find an operator $B\in T_c$ such that $B^{-1}C_\mu^2B^{-1}\in S_1$.

Proof. 1) Let $\{e_k\}$ be a basis of eigenvectors of the operator $B$, and let $P_N$ be the projector on the subspace spanned by the vectors $e_1, \ldots, e_N$. Then

IS sup = 

N-^co IX 

= Iim sup [Sp B - Sp PnK'^'\ = 

N-^oo n 

= IE sup[Sp(/-F^) B^ + SpP^B^l^I-P^) 

N-»oo n 

^ UE sup [Sp(/-P^) S, + Spfii/^(7-P^) = 

N^ao n 

= 2 Urn supSp(/-Pjv)P^ = 2 Uin sup ^ {B^e^,e^=Q 

N-*oo IX N-^oo n k>N 

in view of Lemma 2. Hence the set $\{P_NB_\mu^{1/2}P_N\}$ is an $\varepsilon$-net for the set $\{B_\mu^{1/2}\}$ for $N$ sufficiently large. The compactness of the set $\{P_NB_\mu^{1/2}P_N\}$ follows from the fact that it is a bounded set in a finite-dimensional space. Assertion 1 of the lemma is thus proved.

2) Let $C_1, C_2, \ldots, C_l$ be an $\varepsilon$-net in the set $\{C_\mu\}$. Denote by $C'_\mu$ the operator $C_k$ with the smallest index $k$ for which $\mathrm{Sp}\,(C_\mu - C_k)(C_\mu - C_k)\le\varepsilon^2$. Then $C_\mu^2 = C'^2_\mu + D_\mu$, where

$$D_\mu = C'_\mu(C_\mu - C'_\mu) + (C_\mu - C'_\mu)C'_\mu + (C_\mu - C'_\mu)^2.$$

It is easy to see that $D_\mu\in S^*$ and

$$\mathrm{Sp}\,|D_\mu| \le 2\sqrt{\mathrm{Sp}\,C'^2_\mu\ \mathrm{Sp}\,(C_\mu - C'_\mu)^2} + \varepsilon^2 = O(\varepsilon).$$

Now note that $C'^2_\mu$ takes on, for distinct $\mu$, only a finite number of values, and for each of them the series $\sum_k (C'^2_\mu e_k, e_k)$ converges for any orthonormal basis; therefore this convergence is uniform in $\mu$. To complete the proof of the lemma we use the corollary of Lemma 2. □

The lemmas proved above enable us to find a more efficient condition of compactness of measures (as compared with that given in Theorem 1).

Theorem 2. Let $M = \{\mu\}$ be a family of finite measures on $\mathfrak{B}$. In order that $M$ be weakly compact it is necessary and sufficient that

1) for every $\varepsilon > 0$ there exists $c$ such that $\mu\{x: |x| > c\} < \varepsilon$ for all $\mu\in M$;

2) for each $c$ the family of operators $B_\mu^c$ defined by the relations

$$\int_{|x|\le c}(z, x)^2\,\mu(dx) = |B_\mu^cz|^2$$

is a compact set in $\mathcal{H}_{\mathcal{X}}$. Condition 2) can be replaced by the following:

2') the series

$$\sum_{k=1}^{\infty}\int_{|x|\le c}(x, e_k)^2\,\mu(dx)$$

is convergent uniformly in $\mu$ for each $c > 0$ in some basis (and hence in every basis).

Proof. Without loss of generality we shall assume that $\mu(\mathcal{X}) = 1$. The sufficiency follows from Theorem 1 since

Re(l-x^(z))<i I {x,zfn{dx) + 

|x|«c 



+ /i({x:|x|>c})<i + 

if $c$ is sufficiently large, and in view of assertion 2 of Lemma 3, $(B_\mu^c)^2 = BA_\mu B$, where $B\in T_c$ and $\mathrm{Sp}\,A_\mu\le 1$. We now show the necessity of conditions of the theorem. Let $M$ be a compact set and $K$ be a compactum such that $\mu(\mathcal{X} - K) < \varepsilon$ for all $\mu\in M$. If $c$ is such that $|x|\le c$ for $x\in K$ it follows that $\mu(\{x: |x| > c\}) < \varepsilon$. The necessity of condition 1 is proved. Next putting $V_c = \{x: |x|\le c\}$ we obtain



$$\int_{V_c}(z, x)^2\,\mu(dx) = \int_{K\cap V_c}(z, x)^2\,\mu(dx) + \int_{V_c\setminus K}(z, x)^2\,\mu(dx).$$

Let $K = \{x: |B^{-1}x|\le 1\}$, where $B\in T_c$. Then

$$\int_{K\cap V_c}(z, x)^2\,\mu(dx) = \int_{K\cap V_c}(B^{-1}x, Bz)^2\,\mu(dx) = |A_\mu^{1/2}Bz|^2,$$

where operator $A_\mu$ is defined by the relation

$$|A_\mu^{1/2}z|^2 = \int_{K\cap V_c}(B^{-1}x, z)^2\,\mu(dx).$$

Therefore

$$\mathrm{Sp}\,A_\mu = \int_{K\cap V_c}|B^{-1}x|^2\,\mu(dx) \le 1.$$

Thus

$$\int_{K\cap V_c}(z, x)^2\,\mu(dx) = |A_\mu^{1/2}Bz|^2,$$

where $A_\mu\in S_1$ and $B\in T_c$.

On the other hand if $D$ is defined by equality

$$\int_{V_c\setminus K}(z, x)^2\,\mu(dx) = (Dz, z),$$

then

$$\mathrm{Sp}\,D = \int_{V_c\setminus K}|x|^2\,\mu(dx).$$

By choosing compactum $K$ in an appropriate manner, we can make this quantity arbitrarily small uniformly for all $\mu$. Therefore in view of the corollary to Lemma 2 and assertion 1 of Lemma 3 the totality of operators $\{B_\mu^c\}$ is compact in $\mathcal{H}_{\mathcal{X}}$. The necessity of condition 2) is thus proved. The necessity of condition 2') follows from Lemma 2. The theorem is thus proved. □



Corollary 1. Let the correlation operators 



$$(A_\mu z, z) = \int (z, x)^2\,\mu(dx)$$

exist for measures peM and let Then in order for the family of 

measures $M$ to be compact, it is sufficient that the family of operators $A_\mu^{1/2}$ be compact in $\mathcal{H}_{\mathcal{X}}$. This condition is also necessary if one can find $c > 0$ such that $\mu(\{x: |x| > c\}) = 0$ for all $\mu\in M$.

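Corollary 2. Let operator $A_\mu$ be defined by equality

$$(A_\mu z, z) = \int \frac{(z, x)^2}{1 + |x|^2}\,\mu(dx). \qquad (6)$$

Then in order that a family of measures $M$ be compact it is necessary and sufficient that:

1) the set of operators $\{A_\mu^{1/2}\}$ be compact in $\mathcal{H}_{\mathcal{X}}$;

2) $\lim_{c\to\infty}\sup_\mu \mu(\{x: |x| > c\}) = 0$.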

We now state a convenient condition for weak convergence of measures.

Theorem 3. In order that a sequence of measures $\mu_n$ converge weakly to measure $\mu$ it is necessary and sufficient that:

1) the characteristic functionals $\chi_n(z)$ of measures $\mu_n$ converge for all $z\in\mathcal{X}$ to the characteristic functional $\chi(z)$ of measure $\mu$;

2) the family of operators $A_n^{1/2}$ defined by equation (6) be a compactum in $\mathcal{H}_{\mathcal{X}}$.
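A standard example suggests why condition 1) alone is not enough. The sketch below assumes the hypothetical sequence of unit masses $\mu_n$ at the basis vectors $e_n$ and takes the operators in (6) to be given by $(A_nz, z) = \int (z, x)^2(1 + |x|^2)^{-1}\mu_n(dx)$, which is an assumption about the garbled formula. The characteristic functionals $e^{i(z, e_n)}$ then tend to 1 for each fixed $z$, while the operators $A_n^{1/2}$ stay at Hilbert-Schmidt distance 1 from one another and the integrals of $f(x) = \min(|x|, 1)$ do not converge to the value at the unit mass at 0. All names are illustrative.

```python
import cmath
import math

def char_functional(n, z):
    """chi_n(z) = exp(i (z, e_n)) for the unit mass at the n-th basis vector (1-based)."""
    return cmath.exp(1j * z[n - 1])

def sqrt_operator(n, dim):
    """Matrix of A_n^{1/2} in the first `dim` coordinates.

    For the unit mass at e_n, (A_n z, z) = (z, e_n)^2 / (1 + |e_n|^2) = (z, e_n)^2 / 2,
    so A_n^{1/2} is 2^{-1/2} times the projector on e_n.
    """
    scale = 1.0 / math.sqrt(2.0)
    return [[scale if (i == j == n - 1) else 0.0 for j in range(dim)]
            for i in range(dim)]

def hs_distance(P, Q):
    """Hilbert-Schmidt (Frobenius) distance between two matrices."""
    return math.sqrt(sum((P[i][j] - Q[i][j]) ** 2
                         for i in range(len(P)) for j in range(len(P))))

N = 1000
z = [1.0 / k for k in range(1, N + 1)]  # a fixed square-summable vector, truncated
gaps = [abs(char_functional(n, z) - 1.0) for n in (10, 100, 1000)]

distance = hs_distance(sqrt_operator(2, 6), sqrt_operator(5, 6))

def f(norm_x):
    """Bounded continuous test function f(x) = min(|x|, 1), as a function of |x|."""
    return min(norm_x, 1.0)

integral_n = f(1.0)      # integral of f against the unit mass at e_n
integral_limit = f(0.0)  # integral of f against the unit mass at 0
```

Since the operators $A_n^{1/2}$ are mutually at distance 1, no subsequence of them converges, so condition 2) fails, in agreement with the absence of weak convergence.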

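Proof. The necessity of the conditions of the theorem follows from Corollary 2. In view of Corollary 2 and Theorem 5 of Section 1, in order to prove sufficiency one is required only to show that

$$\lim_{c\to\infty}\overline{\lim_{n\to\infty}}\,\mu_n(\{x: |x| > c\}) = 0. \qquad (7)$$

Let

$$\nu_n(A) = \mu_n\left(\left\{x: \frac{x}{\sqrt{1 + |x|^2}}\in A\right\}\right).$$

Since

$$\int (z, x)^2\,\nu_n(dx) = \int \frac{(z, x)^2}{1 + |x|^2}\,\mu_n(dx) = (A_nz, z),$$

the family of measures $\nu_n$ is compact. Relation (7) is equivalent to the following:

$$\lim_{\varepsilon\to 0}\overline{\lim_{n\to\infty}}\,\nu_n(\{x: |x| > 1 - \varepsilon\}) = 0. \qquad (8)$$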

Assume that (8) is not satisfied. One can find a weakly convergent subsequence $\nu_{n_k}$ such that its limit $\nu$ satisfies condition $\nu(\{x: |x| = 1\}) > 0$. Consequently for some $z$, $|z| = 1$, and $\delta > 0$ we have

$$\nu(\{x: |x| = 1,\ |(x, z)| > \delta\}) > \delta.$$

Then for all $\varepsilon > 0$ and for $n_k$ sufficiently large,

$$\nu_{n_k}(\{x: |x| > 1 - \varepsilon;\ |(x, z)| > \delta\}) > \delta,$$

and hence






|x| 



Therefore for each a >0 









fc-^0 






2e 



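On the other hand for each $z$

$$\lim_{c \to \infty} \varlimsup_{n \to \infty} \mu_n(\{x : |(x, z)| > c\}) \le
\lim_{c \to \infty} \varlimsup_{n \to \infty} \frac{\pi}{\pi - 1} \int \Bigl(1 - \frac{\sin \frac{\pi}{c}(x, z)}{\frac{\pi}{c}(x, z)}\Bigr) \mu_n(dx) =$$

$$= \lim_{c \to \infty} \lim_{n \to \infty} \frac{\pi}{\pi - 1} \, \frac{c}{2\pi} \int_{-\pi/c}^{\pi/c} (1 - \chi_n(tz)) \, dt =
\lim_{c \to \infty} \frac{\pi}{\pi - 1} \, \frac{c}{2\pi} \int_{-\pi/c}^{\pi/c} (1 - \chi(tz)) \, dt = 0.$$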

The contradiction obtained proves the theorem. □ 
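The two elementary facts behind the last chain of relations can be checked numerically. The sketch below is only an illustration for a real variable $u$ standing in for $(x, z)$; the function names are ours and are not part of the text. It verifies that the average of $e^{itu}$ over $[-\pi/c, \pi/c]$ equals $\sin(\pi u/c)/(\pi u/c)$, and that the indicator of $\{|u| > c\}$ is dominated by $\frac{\pi}{\pi-1}\bigl(1 - \frac{\sin(\pi u/c)}{\pi u/c}\bigr)$.

```python
import math


def sinc(v: float) -> float:
    """Return sin(v)/v, with the removable singularity at v = 0 filled in."""
    return 1.0 if v == 0.0 else math.sin(v) / v


def averaged_cosine(u: float, c: float, steps: int = 20000) -> float:
    """Midpoint-rule value of (c / (2*pi)) * integral of cos(t*u) over [-pi/c, pi/c].

    The sine part of exp(i*t*u) integrates to zero by symmetry, so the
    cosine alone gives the averaged exponential.
    """
    half_width = math.pi / c
    h = 2.0 * half_width / steps
    total = sum(math.cos((-half_width + (j + 0.5) * h) * u) for j in range(steps))
    return c / (2.0 * math.pi) * total * h


def tail_indicator_is_dominated(u: float, c: float) -> bool:
    """Check 1{|u| > c} <= pi/(pi - 1) * (1 - sinc(pi*u/c)) for real u."""
    indicator = 1.0 if abs(u) > c else 0.0
    bound = math.pi / (math.pi - 1.0) * (1.0 - sinc(math.pi * u / c))
    return indicator <= bound + 1e-12
```

Integrating the domination inequality with respect to $\mu_n$ gives the first inequality of the chain; the averaging identity turns the integral of the sine quotient into an integral of $\chi_n(tz)$.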

Remark. If a sequence of measures $\mu_n$ is weakly convergent to a measure
$\mu$, then $\chi_n(z) \to \chi(z)$ uniformly for $|z| \le c$ for any $c > 0$. Let $B$ be an operator

in 7^ such that operators A„eSi can be found with the property that 
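$\operatorname{Re}(1 - \chi_n(z)) < \varepsilon^2/8$ for $(BA_nBz, z) < 1$. If $|Bz| < 1$, then $(A_nBz, Bz) < 1$ all
the more. Therefore

$$|\chi_n(z_1) - \chi_n(z_2)|^2 \le \int |e^{i(z_1, x)} - e^{i(z_2, x)}|^2 \, \mu_n(dx) =$$

$$= 2 \int (1 - \cos(z_1 - z_2, x)) \, \mu_n(dx) = 2 \operatorname{Re}(1 - \chi_n(z_1 - z_2)) < \frac{\varepsilon^2}{4}$$

provided only that $|Bz_1 - Bz_2| < 1$. Since the set $\{Bz : |z| \le c\}$ is a compactum, there
exists a finite collection of points $z_1, \ldots, z_N$ such that $\inf_k |Bz - Bz_k| < 1$ for
all $z$, $|z| \le c$. Then

$$\varlimsup_{n \to \infty} \sup_{|z| \le c} |\chi_n(z) - \chi(z)| \le \varlimsup_{n \to \infty} \sup_k |\chi_n(z_k) - \chi(z_k)| +$$

$$+ 2 \varlimsup_{n \to \infty} \sup_k \sup\{|\chi_n(z) - \chi_n(z_k)| ;\ |B(z - z_k)| < 1\} \le \varepsilon.$$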



Our assertion is thus proved. □ 



§ 3. Sums of Independent Random Variables with Values in a Hilbert Space 

In this section we shall consider, in addition to probability measures on
Hilbert spaces, random variables with values in Hilbert spaces having
these measures as their distributions. Let $\{\Omega, \mathfrak{A}, P\}$ be a probability
space and $\{\mathcal{X}, \mathfrak{B}\}$ be a Hilbert space with a $\sigma$-algebra of Borel
sets. A function $\xi(\omega)$ defined on $\Omega$ with values in $\mathcal{X}$ such that
$\{\omega : \xi(\omega) \in B\} \in \mathfrak{A}$ for all $B \in \mathfrak{B}$ is called a random variable with values in $\mathcal{X}$.
Hereafter we shall write $\xi$ in place of $\xi(\omega)$. The distribution of the variable $\xi$ is the
measure

$$\mu_\xi(B) = P\{\xi \in B\} = P\{\omega : \xi(\omega) \in B\}.$$

With each random variable $\xi$ one associates a subalgebra $\mathfrak{A}_\xi$ of the algebra
$\mathfrak{A}$ of events of the form $\{\omega : \xi(\omega) \in B\}$ where $B$ is an arbitrary set in $\mathfrak{B}$.
The random variables $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ are called independent if the
$\sigma$-algebras of events $\mathfrak{A}_{\xi_1}, \mathfrak{A}_{\xi_2}, \ldots, \mathfrak{A}_{\xi_n}, \ldots$ are independent, i.e. for any events
$A_k \in \mathfrak{A}_{\xi_k}$, $k = 1, \ldots, n$,

$$P\Bigl\{\bigcap_{k=1}^{n} A_k\Bigr\} = \prod_{k=1}^{n} P\{A_k\}.$$

We shall find expressions for characteristics of the sum of independent
random variables in terms of characteristics of the summands. The
function

$$\chi_\xi(z) = E e^{i(z, \xi)} = \int e^{i(z, x)} \, \mu_\xi(dx)$$

is called the characteristic functional of a random variable $\xi$, i.e. this is
the characteristic functional of the distribution of the variable $\xi$.

If $\xi_1, \xi_2, \ldots, \xi_n$ are independent and $\chi_k(z)$ is the characteristic functional
of the variable $\xi_k$, then

$$E \exp\Bigl\{ i \Bigl(z, \sum_{k=1}^{n} \xi_k\Bigr) \Bigr\} = \prod_{k=1}^{n} \chi_k(z). \tag{1}$$



Thus when adding independent random variables, the corresponding
characteristic functionals are multiplied. To obtain the expression for the
distribution of a sum of independent random variables, consider the case
of two summands. Let $\zeta = \xi_1 + \xi_2$ and let $\mu_{\xi_1}$, $\mu_{\xi_2}$ be the distributions of
the variables $\xi_1$ and $\xi_2$ respectively. Then

$$\mu_\zeta(B) = P\{\xi_1 + \xi_2 \in B\} = E\,P\{\xi_1 + \xi_2 \in B \mid \mathfrak{A}_{\xi_1}\} =$$

$$= E\,P\{\xi_2 \in B - \xi_1 \mid \mathfrak{A}_{\xi_1}\} = E\,\mu_{\xi_2}(B - \xi_1),$$

where $B - x$ is the set of $y$'s such that $x + y \in B$. Note that $\mu_{\xi_2}(B - x)$ is a
$\mathfrak{B}$-measurable function. Therefore

$$E\,\mu_{\xi_2}(B - \xi_1) = \int \mu_{\xi_2}(B - x) \, \mu_{\xi_1}(dx).$$

Thus we have the formula

$$\mu_{\xi_1 + \xi_2}(B) = \int \mu_{\xi_2}(B - x) \, \mu_{\xi_1}(dx), \tag{2}$$

i.e. the distribution of the sum of two independent random variables is
a convolution of the distributions of the summands.
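Formulas (1) and (2) can be illustrated in the simplest possible setting. The following sketch assumes distributions with finitely many atoms in $R^2$ (a finite-dimensional stand-in for the Hilbert space); the two sample distributions and all function names are hypothetical choices made for the illustration.

```python
import cmath
from collections import defaultdict


def char_functional(dist, z):
    """E exp(i (z, xi)) for a finite distribution {point: probability} on R^2."""
    return sum(p * cmath.exp(1j * (z[0] * x[0] + z[1] * x[1])) for x, p in dist.items())


def convolve(dist1, dist2):
    """Distribution of xi1 + xi2 for independent xi1 ~ dist1 and xi2 ~ dist2.

    This is formula (2): the mass of dist2 is shifted by each atom x of dist1
    and weighted with the probability of that atom.
    """
    out = defaultdict(float)
    for x, p in dist1.items():
        for y, q in dist2.items():
            out[(x[0] + y[0], x[1] + y[1])] += p * q
    return dict(out)


# Hypothetical example distributions.
mu1 = {(0.0, 0.0): 0.5, (1.0, 0.0): 0.3, (0.0, 2.0): 0.2}
mu2 = {(1.0, 1.0): 0.6, (-1.0, 0.5): 0.4}
mu_sum = convolve(mu1, mu2)
```

For any $z$ the value `char_functional(mu_sum, z)` agrees, up to rounding, with the product `char_functional(mu1, z) * char_functional(mu2, z)`, which is formula (1) for two summands.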



Convergence of series of independent random variables. We present
a number of inequalities which extend Kolmogorov's inequality and its
various generalizations to the case of variables with values in a Hilbert
space.

Lemma 1. Let $\xi_1, \xi_2, \ldots, \xi_n$ be independent random variables such that
$E\xi_i = 0$, $E|\xi_i|^2 < \infty$ and $\zeta_k = \sum_{i=1}^{k} \xi_i$. Then

$$P\{\sup_{k \le n} |\zeta_k| > \varepsilon\} \le \frac{1}{\varepsilon^2} E|\zeta_n|^2. \tag{3}$$

The proof follows from the fact that $|\zeta_k|^2$ is a semi-martingale and
from inequality (16) in Section 2 of Chapter II. □
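Inequality (3) can be checked exactly on a small example. The sketch below assumes summands of the form $\xi_i = \pm v_i$ with equal probabilities, where the $v_i$ are fixed vectors in $R^2$ (our choice, made so that all $2^n$ outcomes can be enumerated); the function names are not from the text.

```python
import itertools
import math


def partial_sums(signs, vectors):
    """Return the partial sums zeta_1, ..., zeta_n of the signed vectors."""
    running = [0.0] * len(vectors[0])
    sums = []
    for s, v in zip(signs, vectors):
        running = [r + s * c for r, c in zip(running, v)]
        sums.append(tuple(running))
    return sums


def kolmogorov_sides(vectors, eps):
    """Return (P{sup_k |zeta_k| > eps}, E|zeta_n|^2 / eps^2) by full enumeration."""
    n = len(vectors)
    weight = 1.0 / 2 ** n  # every sign pattern is equally likely
    prob_sup = 0.0
    second_moment = 0.0
    for signs in itertools.product((-1, 1), repeat=n):
        sums = partial_sums(signs, vectors)
        if max(math.hypot(*s) for s in sums) > eps:
            prob_sup += weight
        second_moment += weight * math.hypot(*sums[-1]) ** 2
    return prob_sup, second_moment / eps ** 2
```

Since the summands are independent with zero mean, $E|\zeta_n|^2 = \sum |v_i|^2$, so the right-hand side is easy to compare with a hand computation.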

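Lemma 2. If $\xi_1, \ldots, \xi_n$ are independent and $|\xi_k| \le c$, then for any positive
integer $l$ and positive $a$,

$$P\{\sup_{k \le n} |\zeta_k| > la + (l - 1)c\} \le \Bigl(P\Bigl\{\sup_{k \le n} |\zeta_k| > \frac{a}{2}\Bigr\}\Bigr)^{l}.$$

Proof. Let $\chi_k = 1$ if $|\zeta_k| > (l - 1)a + (l - 2)c$ and $|\zeta_i| \le (l - 1)a + (l - 2)c$ for
$i < k$, and let $\chi_k = 0$ otherwise. Then

$$P\{\sup_{k \le n} |\zeta_k| > la + (l - 1)c\} = \sum_{i=1}^{n} P\{\sup_{k \le n} |\zeta_k| > la + (l - 1)c \mid \chi_i = 1\} \, P\{\chi_i = 1\} \le$$

$$\le \sum_{i=1}^{n} P\{\sup_{i < k \le n} |\zeta_k - \zeta_i| > a\} \, P\{\chi_i = 1\} \le \sup_{i} P\{\sup_{i < k \le n} |\zeta_k - \zeta_i| > a\} \sum_{i=1}^{n} P\{\chi_i = 1\} \le$$

$$\le P\{\sup_{1 \le i < k \le n} |\zeta_k - \zeta_i| > a\} \, P\{\sup_{1 \le k \le n} |\zeta_k| > (l - 1)a + (l - 2)c\}.$$

Finally we note that

$$P\{\sup_{1 \le i < k \le n} |\zeta_k - \zeta_i| > a\} \le P\Bigl\{\sup_{k \le n} |\zeta_k| > \frac{a}{2}\Bigr\}.$$

The lemma is thus proved. □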

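A variable $\xi$ is called symmetric if $\xi$ and $-\xi$ are identically distributed.

Lemma 3. If $\xi_1, \ldots, \xi_n$ are symmetric independent random variables then

$$P\{\sup_{k \le n} |\zeta_k| > \varepsilon\} \le 2 P\{|\zeta_n| > \varepsilon\}.$$

Proof. Let $\chi_k = 1$ if $|\zeta_k| > \varepsilon$ and $|\zeta_i| \le \varepsilon$ for $i < k$, and let $\chi_k = 0$ otherwise.
Then

$$P\{|\zeta_n| > \varepsilon,\ \chi_k = 1\} \ge P\{(\zeta_n - \zeta_k, \zeta_k) \ge 0,\ \chi_k = 1\} =$$

$$= P\{(\zeta_n - \zeta_k, \zeta_k) \ge 0 \mid \chi_k = 1\} \, P\{\chi_k = 1\} \ge \tfrac{1}{2} E\chi_k.$$

Therefore

$$P\{|\zeta_n| > \varepsilon\} = \sum_{k=1}^{n} P\{|\zeta_n| > \varepsilon,\ \chi_k = 1\} \ge \tfrac{1}{2} \sum_{k=1}^{n} E\chi_k = \tfrac{1}{2} P\{\sup_{k \le n} |\zeta_k| > \varepsilon\}.$$

The lemma is proved. □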
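The inequality of Lemma 3 can also be verified by direct enumeration. In the sketch below each summand takes the values $-v_i$, $0$, $+v_i$ with probabilities $p$, $1 - 2p$, $p$, so that it is symmetric; this three-point model and the function name are our assumptions for the example, not part of the lemma.

```python
import itertools
import math


def levy_sides(vectors, p, eps):
    """Return (P{sup_k |zeta_k| > eps}, 2 * P{|zeta_n| > eps}) by full enumeration.

    Each xi_i equals -v_i, 0 or +v_i with probabilities p, 1 - 2p, p.
    """
    prob = {-1: p, 0: 1.0 - 2.0 * p, 1: p}
    dim = len(vectors[0])
    p_sup = 0.0
    p_last = 0.0
    for signs in itertools.product((-1, 0, 1), repeat=len(vectors)):
        weight = math.prod(prob[s] for s in signs)
        running = [0.0] * dim
        largest = 0.0
        for s, v in zip(signs, vectors):
            running = [r + s * c for r, c in zip(running, v)]
            largest = max(largest, math.hypot(*running))
        if largest > eps:
            p_sup += weight
        if math.hypot(*running) > eps:
            p_last += weight
    return p_sup, 2.0 * p_last
```

The first returned value never exceeds the second, and it is never smaller than half of the second, since $\{|\zeta_n| > \varepsilon\} \subset \{\sup_k |\zeta_k| > \varepsilon\}$.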

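Lemma 4. Let $\xi_1, \ldots, \xi_n$ be independent random variables such that for
all $k \le n$

$$P\Bigl\{\Bigl|\sum_{i=k}^{n} \xi_i\Bigr| > c\Bigr\} \le \alpha.$$

Then

$$P\{\sup_{k \le n} |\zeta_k| > a + c\} \le \frac{1}{1 - \alpha} P\{|\zeta_n| > a\}. \tag{4}$$

The proof of this assertion is analogous to the proof of Theorem 6,
Section 3, Chapter II. □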

Utilizing these lemmas we now prove Kolmogorov’s three-series 
theorem in the case of a Hilbert space. 

Theorem 1. Let $\xi_1, \xi_2, \ldots$ be a sequence of independent random
variables with values in $\mathcal{X}$. Then in order that the series $\sum_{i=1}^{\infty} \xi_i$ converge, it is
necessary that for any $c > 0$ the following series

1) $\sum_{i=1}^{\infty} a_i$, where $a_i = \int_{|x| \le c} x \, \mu_{\xi_i}(dx)$,

2) $\sum_{i=1}^{\infty} \int_{|x| \le c} |x - a_i|^2 \, \mu_{\xi_i}(dx)$,

3) $\sum_{i=1}^{\infty} P\{|\xi_i| > c\}$,

be convergent and it is sufficient that the series converge for at least one
$c > 0$.

The sufficiency of the conditions of the theorem is proved in exactly 
the same manner as in the one-dimensional case (cf. Theorem 5, Section 3, 
Chapter II). 
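To make the three series concrete, the following sketch computes their terms for one summand with finitely many atoms in $R^2$. The family `summand(k)` is a hypothetical example of ours: $\xi_k = \pm e_1/k$ with probability $(1 - 1/k^2)/2$ each and $\xi_k = k e_2$ with probability $1/k^2$.

```python
import math


def three_series_terms(dist, c):
    """Return (a, spread, tail) for one summand with distribution dist.

    dist maps points of R^2 (tuples) to probabilities; c > 0 is the truncation
    level. a is the integral of x over |x| <= c, spread is the integral of
    |x - a|^2 over |x| <= c, and tail is P{|xi| > c}.
    """
    inside = {x: p for x, p in dist.items() if math.hypot(*x) <= c}
    a = tuple(sum(p * x[j] for x, p in inside.items()) for j in range(2))
    spread = sum(p * ((x[0] - a[0]) ** 2 + (x[1] - a[1]) ** 2) for x, p in inside.items())
    tail = sum(p for x, p in dist.items() if math.hypot(*x) > c)
    return a, spread, tail


def summand(k):
    """Hypothetical example: +-e_1/k with prob (1 - 1/k^2)/2 each, k*e_2 with prob 1/k^2."""
    small = 0.5 * (1.0 - 1.0 / k ** 2)
    return {(1.0 / k, 0.0): small, (-1.0 / k, 0.0): small, (0.0, float(k)): 1.0 / k ** 2}
```

With $c = 1$ and $k \ge 2$ the terms are $a_k = 0$, spread $(1 - 1/k^2)/k^2$ and tail $1/k^2$, so all three series converge and, by the theorem, so does $\sum \xi_k$ for this example.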

The necessity of condition 3) follows from the fact that for each $c > 0$
only a finite number of events $\{|\xi_k| > c\}$ occur and from the Borel-Cantelli
lemma. We shall prove only the necessity of conditions 1) and 2). Let
$\xi'_k = \xi_k$ for $|\xi_k| \le c$ and $\xi'_k = 0$ for $|\xi_k| > c$. Since condition 3) yields that only
a finite number of the variables $\xi_k - \xi'_k$ are nonzero, the series $\sum_{k=1}^{\infty} \xi'_k$ is
convergent as long as the series $\sum_{k=1}^{\infty} \xi_k$ is convergent. Therefore the variable
$\sup_{n,p} \bigl|\sum_{k=n}^{n+p} \xi'_k\bigr|$ is bounded. In view of Lemma 2 for all positive integers $l$

$$P\Bigl\{\sup_{n,p} \Bigl|\sum_{k=n}^{n+p} \xi'_k\Bigr| > l(c + a)\Bigr\} \le \Bigl(P\Bigl\{\sup_{n,p} \Bigl|\sum_{k=n}^{n+p} \xi'_k\Bigr| > \frac{a}{2}\Bigr\}\Bigr)^{l}.$$

We choose a such that 

r al 

P]sup Z ik >-Ke 

Ln,p k = n 2 ) 

Then for all n 

ns 

where A = , K = It follows from this inequality that E Yj ^'k 

c + a k=i 

are uniformly bounded for all $s$ and hence in view of the theorem on the
passage to the limit under the sign of an integral, the limit

$$\lim_{n \to \infty} E\Bigl|\sum_{k=1}^{n} \xi'_k\Bigr|^{s} = E\Bigl|\sum_{k=1}^{\infty} \xi'_k\Bigr|^{s}$$

exists. In particular for $s = 2$ the limit

$$\lim_{n \to \infty} E\Bigl|\sum_{k=1}^{n} \xi'_k\Bigr|^{2} = \lim_{n \to \infty} \Bigl[\sum_{k=1}^{n} E|\xi'_k - a_k|^2 + \Bigl|\sum_{k=1}^{n} a_k\Bigr|^2\Bigr]$$

exists. Consequently, the series $\sum_{k=1}^{\infty} E|\xi'_k - a_k|^2$ is convergent. But now it follows
from the sufficiency of the conditions of the theorem that the series
$\sum_{k=1}^{\infty} (\xi'_k - a_k)$ is convergent and since the series $\sum_{k=1}^{\infty} \xi'_k$ is also convergent,
so is the series $\sum_{k=1}^{\infty} a_k$. The theorem is thus proved. □

k= 1 



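Corollary. In order that the series $\sum_{k=1}^{\infty} \xi_k$ of independent random variables
converge it is sufficient that the series $\sum_{k=1}^{\infty} E\xi_k$ and $\sum_{k=1}^{\infty} E|\xi_k - E\xi_k|^2$
converge, where $E\xi_k = \int x \, \mu_{\xi_k}(dx)$ is a vector in $\mathcal{X}$ such that
$(E\xi_k, z) = E(\xi_k, z)$ for all $z \in \mathcal{X}$.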

The conditions of convergence of series of independent random 
variables can be expressed in terms of characteristic functionals as well. 

Theorem 2. Let $\xi_1, \xi_2, \ldots$ be independent random variables and let
$\chi_n(z)$ be their characteristic functionals. For convergence of the series
$\sum_{k=1}^{\infty} \xi_k$ it is necessary and sufficient that the product $\prod_{k=1}^{\infty} \chi_k(z)$ converge
uniformly in each region $\{z : |z| < c\}$ to a certain characteristic functional
$\chi(z)$.

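Proof. Necessity. Let $\zeta_n = \sum_{k=1}^{n} \xi_k$, $\zeta = \lim_{n \to \infty} \zeta_n$. Then for each $\delta > 0$

$$\Bigl|\prod_{k=1}^{\infty} \chi_k(z) - \prod_{k=1}^{n} \chi_k(z)\Bigr| \le E\bigl|e^{i(z, \zeta - \zeta_n)} - 1\bigr| \le 2 P\{|\zeta - \zeta_n| > \delta\} + \delta |z|.$$

Since $\lim_{n \to \infty} P\{|\zeta - \zeta_n| > \delta\} = 0$, the necessity of the theorem's conditions is
proved.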

Sufficiency. We introduce mutually independent random variables
$\xi'_1, \xi'_2, \ldots$ which are independent of the $\xi_k$, with $\xi'_k$ having the same
distribution as $\xi_k$. Set $\eta_k = \xi_k - \xi'_k$. We first prove the convergence of the
series in $\eta_k$. Clearly, $E e^{i(z, \eta_k)} = |\chi_k(z)|^2$. It follows from the inequality

$$1 - \prod_{k=1}^{n} |\chi_k(z)|^2 \le 1 - \prod_{k=1}^{\infty} |\chi_k(z)|^2 = 1 - |\chi(z)|^2$$

and the fact that $|\chi(z)|^2$ is a characteristic functional of a certain measure,
that the distributions of the variables $\sum_{k=1}^{n} \eta_k$ form a compact family of
measures. Therefore

$$\lim_{c \to \infty} \sup_n P\Bigl\{\Bigl|\sum_{k=1}^{n} \eta_k\Bigr| > c\Bigr\} = 0.$$

But then in view of Lemma 3

$$\lim_{c \to \infty} P\Bigl\{\sup_{n,p} \Bigl|\sum_{k=n}^{n+p} \eta_k\Bigr| > c\Bigr\} \le \lim_{c \to \infty} P\Bigl\{\sup_n \Bigl|\sum_{k=1}^{n} \eta_k\Bigr| > \frac{c}{2}\Bigr\} \le 2 \lim_{c \to \infty} \sup_n P\Bigl\{\Bigl|\sum_{k=1}^{n} \eta_k\Bigr| > \frac{c}{2}\Bigr\} = 0.$$

Therefore the variable $\sup_{n,p} \bigl|\sum_{k=n}^{n+p} \eta_k\bigr|$ is bounded. In particular, the variable
$\sup_k |\eta_k|$ is bounded. Since for $c$ sufficiently large

$$0 < P\{\sup_k |\eta_k| \le c\} = \prod_{k=1}^{\infty} (1 - P\{|\eta_k| > c\}) \le \exp\Bigl\{-\sum_{k=1}^{\infty} P\{|\eta_k| > c\}\Bigr\},$$

it follows that the series $\sum_{k=1}^{\infty} P\{|\eta_k| > c\}$ is convergent. Let $\eta'_k = \eta_k$ for
$|\eta_k| \le c$ and $\eta'_k = 0$ for $|\eta_k| > c$. Then the variable $\sup_{n,p} \bigl|\sum_{k=n}^{n+p} \eta'_k\bigr|$
is also bounded because $\eta'_k = \eta_k$ except possibly for a finite number of
indices $k$. From this fact the convergence of the series $\sum_{k=1}^{\infty} E|\eta'_k|^2$ follows (since
$E\eta'_k = 0$) analogously as was the case in Theorem 1. Hence in view of
Theorem 1 the series

$$\sum_{k=1}^{\infty} \eta_k = \sum_{k=1}^{\infty} (\xi_k - \xi'_k)$$

is convergent with probability 1. Therefore a sequence of vectors $x_1, \ldots,
x_n, \ldots$ belonging to $\mathcal{X}$ (which are the possible values of $\xi'_k$) can be found
such that the series $\sum_{k=1}^{\infty} (\xi_k - x_k)$ will converge with probability 1. To
complete the proof we must establish the convergence of the series $\sum_{k=1}^{\infty} x_k$.

In view of the necessity of the conditions of the theorem which were
proved the following infinite product

$$\prod_{k=1}^{\infty} e^{-i(z, x_k)} \chi_k(z)$$

is uniformly convergent for $|z| \le c$. Since a $\delta$ can be found such that
$|\chi(z)|$ is bounded away from zero for $|z| \le \delta$, there exists the uniform limit

$$\lim_{n \to \infty} \prod_{k=1}^{n} e^{-i(z, x_k)} = \frac{1}{\chi(z)} \prod_{k=1}^{\infty} e^{-i(z, x_k)} \chi_k(z).$$

Therefore there exists the limit $\lim_{n \to \infty} \bigl(z, \sum_{k=1}^{n} x_k\bigr)$ uniformly in $z$ for $|z| \le \delta$
and hence uniformly in $z$ for $|z| \le c$ for any $c > 0$. This implies that $\sum_{k=1}^{n} x_k$
possesses the weak limit $x$:

$$\lim_{n \to \infty} \Bigl(z, \sum_{k=1}^{n} x_k\Bigr) = (z, x),$$

and that moreover $\bigl|\sum_{k=1}^{n} x_k\bigr|$ are jointly bounded. It follows from the uniform
convergence that

$$\lim_{n \to \infty} \Bigl(\sum_{k=1}^{n} x_k, \sum_{k=1}^{n} x_k\Bigr) = \lim_{n \to \infty} \Bigl(\sum_{k=1}^{n} x_k, x\Bigr) = (x, x).$$

Therefore $\sum_{k=1}^{n} x_k$ converges weakly to $x$ and $\bigl|\sum_{k=1}^{n} x_k\bigr| \to |x|$. Thus
$\sum_{k=1}^{n} x_k \to x$. The theorem is proved. □



Corollary. If the series $\sum_{k=1}^{\infty} \xi_k$ converges in probability it then converges
with probability 1 also.

Indeed, in the proof of the necessity of conditions of Theorem 2 only 
convergence in probability of the series was utilized. 
Infinitely divisible distributions in a Hilbert space. A distribution (measure)
$\mu$ is called infinitely divisible if its characteristic functional $\chi(z)$ satisfies the
condition: for any positive integer $n$ there exists a characteristic functional
$\chi_n(z)$ of a certain distribution such that $\chi(z) = (\chi_n(z))^n$.

We derive the general form of a characteristic functional of an
infinitely divisible distribution.
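A simple instance of the definition is the compound Poisson distribution, which we use here only as an illustration (it is not singled out in the text). If $\varphi(z)$ is the characteristic functional of the jumps and $\lambda > 0$ the rate, then $\chi(z) = \exp\{\lambda(\varphi(z) - 1)\}$ and one can take $\chi_n(z) = \exp\{(\lambda/n)(\varphi(z) - 1)\}$. The sketch below assumes jumps with two atoms in $R^2$; the jump distribution and function names are hypothetical.

```python
import cmath
import math

# Hypothetical jump distribution on R^2: {point: probability}.
JUMPS = {(1.0, 0.0): 0.5, (0.0, -2.0): 0.5}


def jump_cf(z):
    """Characteristic functional of a single jump."""
    return sum(p * cmath.exp(1j * (z[0] * x[0] + z[1] * x[1])) for x, p in JUMPS.items())


def compound_poisson_cf(z, rate):
    """Closed form exp(rate * (phi(z) - 1))."""
    return cmath.exp(rate * (jump_cf(z) - 1.0))


def compound_poisson_cf_series(z, rate, terms=60):
    """The same functional as a Poisson mixture of powers of phi(z).

    The truncated series shows that the closed form is the characteristic
    functional of an actual distribution: a Poisson number of independent jumps.
    """
    phi = jump_cf(z)
    return sum(math.exp(-rate) * rate ** m / math.factorial(m) * phi ** m for m in range(terms))
```

Raising `compound_poisson_cf(z, rate / n)` to the power `n` reproduces `compound_poisson_cf(z, rate)` for every positive integer `n`, which is exactly the defining condition.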

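Let $\xi$ be a random variable with values in $\mathcal{X}$ such that $E e^{i(z, \xi)} = \chi(z)$
and let $\xi_{n1}, \ldots, \xi_{nn}$ be independent identically distributed random variables
such that $E e^{i(z, \xi_{nk})} = \chi_n(z)$. We show that, given an arbitrary
$\varepsilon > 0$ one can find $c$ such that for all $k \le n$,

$$P\Bigl\{\Bigl|\sum_{i=1}^{k} \xi_{ni}\Bigr| > c\Bigr\} < \varepsilon. \tag{5}$$

Let $S$ be a kernel operator such that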

l-Rex(z)<^ (Sz, z)<l. 

Then 




Therefore largx(^)| arctg^^^^^ — <-, 

._£ 4 

l-Re(x„(z))"“* = l-|x(z)| " cos|^^argx(z) < 

In J 

g 

<l-|z(z)| cosargx(z)^-. 

Utilizing inequality (7) Section 5 of Chapter V, we obtain 

P { I >c|< (|+2A Sps) ‘ • 

In view of the latter inequality, one can choose $\lambda$ and $c$ such that (5) is
fulfilled. We now obtain from Lemma 4 that 




( 6 ) 
Finally



Hence 



P{sup|^J>4c}<P<sup L 



k=l 



exp<-L P{I^J>4c>^n P{l^ntl<4c}^ 

(. fe=l J k=l 

t P{IU>4c}^log^. 

k = 1 i — Z£ 

The last inequality yields the following

Lemma 5. For all $c$ sufficiently large

$$\sup_n n P\{|\xi_{n1}| > c\} < \infty$$

and

$$\lim_{c \to \infty} \sup_n n P\{|\xi_{n1}| > c\} = 0. \tag{8}$$ □

Define $\xi'_{ni} = \xi_{ni}$ for $|\xi_{ni}| \le c$ and $\xi'_{ni} = 0$ for $|\xi_{ni}| > c$, where $c$ is such that

SUpMP{|4i|>c}<i. 



Pi sup Y. sup ^ >a[ + nP{|<J„i|>c}. 

It follows from (6) and the choice of c that for a sufficiently large 



holds for all n. Therefore in view of Lemma 2 the following quantities 
are uniformly bounded in n\ 



EX and E X ■ 

j=i j=i j=y 
This in particular yields that



\n 



( 9 ) 



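Denote by $\mu_n$ the measure representing the distribution of the variable
$\xi_{n1}$ and let $\pi_n$ be the measure defined by the relation

$$\pi_n(A) = n \int_A \frac{|x|^2}{1 + |x|^2} \, \mu_n(dx).$$

The measures $\pi_n$ are uniformly bounded:

$$\pi_n(\mathcal{X}) \le \sum_{j=1}^{n} E|\xi'_{nj}|^2 + n P\{|\xi_{n1}| > c\}.$$

It follows from (8) that the measures $\pi_n$ satisfy

$$\lim_{c \to \infty} \sup_n \pi_n(\{x : |x| > c\}) = 0. \tag{10}$$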



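We now show that the measures $\pi_n$ are compact. To do this it is sufficient to
show that for each $\varepsilon > 0$ a kernel operator $S$ exists such that

$$\pi_n(\mathcal{X}) - \operatorname{Re} \int e^{i(z, x)} \, \pi_n(dx) \le \varepsilon,$$

as long as $(Sz, z) \le 1$. But

$$\pi_n(\mathcal{X}) - \operatorname{Re} \int e^{i(z, x)} \, \pi_n(dx) = n \int (1 - \cos(z, x)) \frac{|x|^2}{1 + |x|^2} \, \mu_n(dx) \le n[1 - \operatorname{Re} \chi_n(z)] =$$

$$= n\Bigl[1 - |\chi(z)|^{1/n} \cos\Bigl(\frac{1}{n} \arg \chi(z)\Bigr)\Bigr] = n\bigl[1 - |\chi(z)|^{1/n}\bigr] + n |\chi(z)|^{1/n} \Bigl[1 - \cos\Bigl(\frac{1}{n} \arg \chi(z)\Bigr)\Bigr] \le$$

$$\le n\bigl[1 - |\chi(z)|^{1/n}\bigr] + \frac{|\chi(z)|^{1/n}}{2n} [\arg \chi(z)]^2.$$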



Assume that 1— Re/(z)<- (£<1); then |Imx(z)|<^£. Hence in every 
connected region in which this assumption is satisfied, we have 






' £ 71 

|argx(z)|<arctg^^<-, 
s 4 
1 — 

2 
If S' is a kernel operator such that 1 — Rex(^)<- for (Sz, z)^ 1, we have
for these z 

7c„(^)-Re I 7t„(dx)^^+ ^ ® 






For $n > 1$ and $\varepsilon$ sufficiently small the r.h.s. is less than $\varepsilon$. The compactness
of the measures $\pi_n$ is thus proved. □

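Let $a_n$ be defined by the relation

$$(a_n, z) = n \int \frac{(z, x)}{1 + |x|^2} \, \mu_n(dx) = n E \frac{(z, \xi_{n1})}{1 + |\xi_{n1}|^2}.$$

It follows from (9) that the $a_n$ are jointly bounded. Finally we define
symmetric operators $V_n$ by the equality

$$(V_n z, z) = n \int \frac{(z, x)^2}{1 + |x|^2} \, \mu_n(dx).$$

Note that

$$\operatorname{Sp} V_n = n \int \frac{|x|^2}{1 + |x|^2} \, \mu_n(dx) = \pi_n(\mathcal{X})$$

and hence $\operatorname{Sp} V_n$ are uniformly bounded.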

We choose a subsequence $n'$ such that: 1) $\pi_{n'}$ is weakly convergent to
$\pi'$, 2) $a_{n'}$ is weakly convergent to some vector $a$ and 3) for all $z$ there exists
the limit

$$\lim_{n' \to \infty} (V_{n'} z, z) = (Vz, z). \tag{11}$$

The last condition is attainable since in view of the uniform boundedness
($\|V_n\| \le \operatorname{Sp} V_n$) it is sufficient that (11) be satisfied on a certain countable
everywhere dense set in $\mathcal{X}$. Clearly $V$ is also a kernel operator since

$$\operatorname{Sp} V \le \varliminf_{n' \to \infty} \operatorname{Sp} V_{n'}.$$



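Next set $\pi(A) = \pi'(A)$ if $0 \notin A$ and $\pi(\{0\}) = 0$. We then have

$$\chi(z) = [\chi_{n'}(z)]^{n'} = \lim_{n' \to \infty} \exp\Bigl\{ i(a_{n'}, z) - \tfrac{1}{2}(V_{n'} z, z) +$$

$$+ \int \Bigl(e^{i(z, x)} - 1 - \frac{i(z, x)}{1 + |x|^2} + \frac{1}{2} \frac{(z, x)^2}{1 + |x|^2}\Bigr) \frac{1 + |x|^2}{|x|^2} \, \pi_{n'}(dx) \Bigr\} =$$

$$= \exp\Bigl\{ i(a, z) - \tfrac{1}{2}(Vz, z) + \int \Bigl(e^{i(z, x)} - 1 - \frac{i(z, x)}{1 + |x|^2} + \frac{1}{2} \frac{(z, x)^2}{1 + |x|^2}\Bigr) \frac{1 + |x|^2}{|x|^2} \, \pi'(dx) \Bigr\}.$$

The function

$$\Bigl(e^{i(z, x)} - 1 - \frac{i(z, x)}{1 + |x|^2} + \frac{1}{2} \frac{(z, x)^2}{1 + |x|^2}\Bigr) \frac{1 + |x|^2}{|x|^2}$$

is defined by continuity at $x = 0$ to equal 0. Hence

$$\chi(z) = \exp\Bigl\{ i(a, z) - \tfrac{1}{2}(Bz, z) + \int \Bigl(e^{i(z, x)} - 1 - \frac{i(z, x)}{1 + |x|^2}\Bigr) \frac{1 + |x|^2}{|x|^2} \, \pi(dx) \Bigr\}, \tag{12}$$

where

$$(Bz, z) = (Vz, z) - \int \frac{(z, x)^2}{|x|^2} \, \pi(dx).$$

Since

$$(Bz, z) = \lim_{n' \to \infty} \int \frac{(z, x)^2}{|x|^2} \, \pi_{n'}(dx) - \int \frac{(z, x)^2}{|x|^2} \, \pi(dx),$$

and for almost all $\varepsilon > 0$ (such that $\pi(\{x : |x| = \varepsilon\}) = 0$)

$$\lim_{n' \to \infty} \int_{|x| > \varepsilon} \frac{(z, x)^2}{|x|^2} \, \pi_{n'}(dx) = \int_{|x| > \varepsilon} \frac{(z, x)^2}{|x|^2} \, \pi(dx),$$

and moreover the integral $\int_{|x| \le \varepsilon} \frac{(z, x)^2}{|x|^2} \, \pi(dx)$ becomes arbitrarily small for
a suitable choice of $\varepsilon > 0$, it follows that $(Bz, z) \ge 0$ for all $z$. Thus for any
infinitely divisible distribution a vector $a \in \mathcal{X}$, a kernel operator $B$ and a
finite measure $\pi$, with $\pi(\{0\}) = 0$, can be found such that the characteristic
functional $\chi(z)$ of this distribution is of the form (12).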

We now show the converse, i.e. that formula (12) determines the
characteristic functional of a certain distribution. The fact that $\chi(z)$ is
positive definite follows from the observation that $\chi(Pz)$, where $P$ is the
projector on a finite-dimensional subspace and $z$ varies in this
subspace, is the characteristic functional of a certain infinitely divisible
distribution in that subspace.
Next utilizing relations

$$-\ln |\chi(z)| = \tfrac{1}{2}(Bz, z) + \int (1 - \cos(z, x)) \, \pi(dx) + \int \frac{1 - \cos(z, x)}{|x|^2} \, \pi(dx),$$

$$\arg \chi(z) = (a, z) + \int \sin(z, x) \, \pi(dx) + \int \frac{\sin(z, x) - (z, x)}{|x|^2} \, \pi(dx),$$



|sint — sin^t^4(l — cost), 



we verify that for some C 
1 -Re;c(z)< 1 -|x(z)| +i(argx(z)f < 






(a, z)^+ (Bz, z) + ( 1 — cos (z, x)) n{dx) + 



+ 



(z, xf 



n{dx)-\- 






|X|^ 

For the measure $\pi$ and each $\varepsilon > 0$ one can find a kernel operator $S'$ such that



( 1 — cos (z, x)) 71 (dx) < — for (S' z, z) < 1 . 



Setting 



5 - 




(5+(7) + 5', 



where the kernel operator $U$ is defined by the equality

$$(Uz, z) = (a, z)^2 + \int \frac{(z, x)^2}{|x|^2} \, \pi(dx), \qquad \operatorname{Sp} U = |a|^2 + \pi(\mathcal{X}),$$

we have: $1 - \operatorname{Re} \chi(z) < \varepsilon$ for $(Sz, z) < 1$. Hence $\chi(z)$ is a characteristic
functional. We have thus proved the following theorem:

Theorem 3. In order that a functional $\chi(z)$ be the characteristic functional
of a certain infinitely divisible distribution, it is necessary and sufficient
that there exist a vector $a \in \mathcal{X}$, a kernel operator $B$ and a finite measure
$\pi$ on $\mathfrak{B}$ with $\pi(\{0\}) = 0$ such that $\chi(z)$ is representable by formula (12).
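Formula (12) is easy to evaluate when the space is $R^2$ and $\pi$ has finitely many atoms. The sketch below is written under exactly these simplifying assumptions; the triple `(a, B, pi_measure)` used in any call and the function names are hypothetical.

```python
import cmath


def levy_khinchin_cf(z, a, B, pi_measure):
    """Evaluate formula (12) in R^2 for a finite discrete measure pi.

    a is a vector, B a symmetric nonnegative 2x2 matrix (list of rows) and
    pi_measure maps nonzero points of R^2 to their masses.
    """
    az = a[0] * z[0] + a[1] * z[1]
    Bzz = sum(B[i][j] * z[i] * z[j] for i in range(2) for j in range(2))
    integral = 0j
    for x, mass in pi_measure.items():
        zx = z[0] * x[0] + z[1] * x[1]
        r2 = x[0] ** 2 + x[1] ** 2  # nonzero because pi({0}) = 0
        integral += (cmath.exp(1j * zx) - 1.0 - 1j * zx / (1.0 + r2)) * (1.0 + r2) / r2 * mass
    return cmath.exp(1j * az - 0.5 * Bzz + integral)


def divide_triple(a, B, pi_measure, n):
    """Return the triple (a/n, B/n, pi/n), which gives the n-th root of chi."""
    return (
        tuple(c / n for c in a),
        [[entry / n for entry in row] for row in B],
        {x: mass / n for x, mass in pi_measure.items()},
    )
```

Since the exponent in (12) is linear in the triple $(a, B, \pi)$, the functional built from `divide_triple(a, B, pi_measure, n)` raised to the power `n` reproduces $\chi(z)$, which makes the infinite divisibility of (12) visible.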

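Remark. The representation of $\chi(z)$ by means of formula (12) is unique.
Indeed,

$$(Bz, z) = -2 \lim_{t \to \infty} \frac{1}{t^2} \ln \chi(tz).$$

Therefore we shall assume hereafter that $B = 0$. Let $\{e_k\}$ be a basis and
the numbers $c_k > 0$ be such that $\sum_k c_k < \infty$. Then the series

$$\sum_{k=1}^{\infty} c_k \Bigl[1 - \frac{e^{it(x, e_k)} + e^{-it(x, e_k)}}{2}\Bigr] = \sum_{k=1}^{\infty} c_k (1 - \cos t(e_k, x))$$

converges monotonically to a bounded function and hence the series

$$\sum_{k=1}^{\infty} c_k \Bigl[\ln \chi(z) - \frac{\ln \chi(z + te_k) + \ln \chi(z - te_k)}{2}\Bigr] = \int e^{i(z, x)} \sum_{k=1}^{\infty} c_k [1 - \cos t(e_k, x)] \frac{1 + |x|^2}{|x|^2} \, \pi(dx) \tag{13}$$

is convergent.

Therefore knowing $\chi(z)$ one can determine the expression

$$\int e^{i(z, x)} \sum_{k=1}^{\infty} c_k \Bigl(1 - \frac{\sin \delta(e_k, x)}{\delta(e_k, x)}\Bigr) \frac{1 + |x|^2}{|x|^2} \, \pi(dx) \tag{14}$$

which is obtained if the r.h.s. of (13) is integrated with respect to $t$ from
$-\delta$ up to $\delta$ and then divided by $2\delta$. It thus follows that the measure

$$\tilde{\pi}(A) = \int_A \sum_{k=1}^{\infty} c_k \Bigl(1 - \frac{\sin \delta(e_k, x)}{\delta(e_k, x)}\Bigr) \frac{1 + |x|^2}{|x|^2} \, \pi(dx)$$

is uniquely determined since (14) is the characteristic functional of this
measure. The measure $\pi$ is completely determined by the conditions:

1) $\pi(\{0\}) = 0$,

2) if $0 \notin A$, then

$$\pi(A) = \int_A \Bigl[\sum_{k=1}^{\infty} c_k \Bigl(1 - \frac{\sin \delta(e_k, x)}{\delta(e_k, x)}\Bigr)\Bigr]^{-1} \frac{|x|^2}{1 + |x|^2} \, \tilde{\pi}(dx).$$

Hence the measure $\pi$ is determined by the values of $\chi(z)$. Therefore $a$ is
also uniquely determined by the values of $\chi(z)$.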
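The averaging step used in passing from (13) to (14) rests on an elementary integral. For a real number $u \neq 0$, which here stands for $(e_k, x)$, it can be written out as follows (for $u = 0$ both sides are taken to be 0):

```latex
\frac{1}{2\delta}\int_{-\delta}^{\delta}\bigl(1-\cos t u\bigr)\,dt
  = 1 - \frac{1}{2\delta}\Bigl[\frac{\sin t u}{u}\Bigr]_{t=-\delta}^{t=\delta}
  = 1 - \frac{\sin \delta u}{\delta u}.
```

The right-hand side is strictly positive for $u \neq 0$, which is what allows division by the sum over $k$ when $\pi$ is recovered from $\tilde{\pi}$ on sets not containing 0.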

A limit theorem for sums of independent random variables. Let
$\xi_{n1}, \ldots, \xi_{nk_n}$ be a double sequence of independent random variables and let
$\zeta_n = \sum_{k=1}^{k_n} \xi_{nk}$. The variables $\xi_{nk}$ are assumed to be infinitely small, i.e.
$\lim_{n \to \infty} \sup_k P\{|\xi_{nk}| > \varepsilon\} = 0$ for each $\varepsilon > 0$. We now find the conditions under
which the distribution of $\zeta_n$ converges as $n \to \infty$ to a certain limiting
distribution. Denote by $\mu_{nk}$ the distribution of the variable $\xi_{nk}$ and the
distribution of $\zeta_n$ by $\nu_n$. Next let $a_{nk}$ be determined by the equality






(z,x) 






for all $z \in \mathcal{X}$. The existence of these $a_{nk}$ and their uniqueness, provided
they satisfy the inequality $|a_{nk}| \le \delta < 1$ for $n$ sufficiently large, follows from
the relations



\Ta\^ 



1^1 

1 + |x — ^^1^ 



^nk(dx)^s-\-\a\ fink({x:\x\>s}), 



where 



\Ta-Tb\^\a-b\ 



^ M l+|x-a|^ 

(2ix| + M + |6|) M 



t^nkidx), 



u„k(dx)^ 



(1 + |x — a|^) (l + |x-6|^) ' 

< |a — fe| [2s^ + 2Ss + Lfi„i,{{x ; |x| > e})] , 



L = sup 



(2|x|4-N + |fo|) W 
(1 + |x-u|^) (1 +|x — h|^) 



; \a\^S, \b\^3,xeX>^2 






(1-^) 



2* 



It also follows from these inequalities that the operator $T$ in the region
$|a| \le \delta < 1$ is a contraction and maps this region into itself. Therefore the $a_{nk}$
exist and are unique.

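Set

$$a_n = \sum_{k=1}^{k_n} a_{nk}, \qquad (V_n z, z) = \sum_{k=1}^{k_n} (V_{nk} z, z),$$

$$(V_{nk} z, z) = \int \frac{(z, x - a_{nk})^2}{1 + |x - a_{nk}|^2} \, \mu_{nk}(dx),$$

and let the measure $\mu_n$ be determined by the equality

$$\int_A \mu_n(dx) = \sum_{k=1}^{k_n} \int_{\{x :\, x - a_{nk} \in A\}} \frac{|x - a_{nk}|^2}{1 + |x - a_{nk}|^2} \, \mu_{nk}(dx).$$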

Theorem 4. In order that a sequence of measures $\nu_n$ converge weakly as
$n \to \infty$ to a measure $\nu$ it is necessary and sufficient that the following
conditions be satisfied:

1) $\mu_n$ converges weakly to a certain measure $\pi'$;

2) the limit $a = \lim_{n \to \infty} a_n$ exists;

3) the sequence of operators $V_n$ is such that $\sum_{k=1}^{\infty} (V_n e_k, e_k)$ converges
uniformly in $n$ and for each $z$ the limit $\lim_{n \to \infty} (V_n z, z) = (Vz, z)$ is valid, where
$V$ is a kernel operator. Moreover, the characteristic functional of the
limiting distribution is given by formula (12) with

$$(Bz, z) = (Vz, z) - \int \frac{(z, x)^2}{|x|^2} \, \pi(dx),$$

$$\pi(A) = \pi'(A) \ \text{ for } 0 \notin A, \qquad \pi(\{0\}) = 0.$$

Proof Sufficiency. We have 

fc=l J 



= e 



Note that 



n )\^-HKkZ,z) + 



,i (z, an) 

fc= 1 V. \ ^ i_ 

i (z, X - a„y l(x- a„t, z)^ ‘ 
) ' ^ 



l+|x-a„tP 2 1 + |x-aJ^ 



^i(z,x-a„k) _ J . 






1 (z, x-gj^ 

2 l + |x-aJ^J 



Mnk(dx) 



I 



l + |x-a„/ 

[cos(z, x-aj- 1] n„k{dx) + 



• rr • t t 1 (J 



+ j I I sin(z, x-a„t)- 7 ^-^^ \ n„k{dx) = 

= 0((F„tZ,z) + (l+|z|)/r„*({x:|x|>l})). 
It follows from the last inequality that 

In E = i(z, a„) -i(F„z, z)+ 
k„ r- 

fc= 1 J L 



J(z.x-a„k) 2 H^’X g„t) ^ 



l + |x-aj^ 



+ X 



1 (z,x-aj^ 



2 l+|x-aJ^J 



Mnk(dx) + 



+ O [sup ( F„fcZ, z) + (1 + |z|) sup {x : |x| > 1 }] . 

k k 
It is easy to verify that

$$\lim_{n \to \infty} \sup_k |a_{nk}| = 0.$$



Moreover 

= j l + 2,52 + \a„f + P {|^„,| > , 

and hence 



lim sup Sp K,=o. 

n-* 00 k 

Therefore, in view of the conditions of the theorem, we have for all z 



lim InE = a) — ^{Vz, z) + 



kn 



+ lim 5 : 



t=l J L 
(z, 



2 l+|x-aj2 



^i(z,x~ank) _ ^ . 

H„k(dx). 



i(z,x-gj 

l+lx-a„/ 



+ 



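Note that

$$\lim_{n \to \infty} \sum_{k=1}^{k_n} \int \Bigl[e^{i(z, x - a_{nk})} - 1 - \frac{i(z, x - a_{nk})}{1 + |x - a_{nk}|^2} + \frac{1}{2} \frac{(z, x - a_{nk})^2}{1 + |x - a_{nk}|^2}\Bigr] \mu_{nk}(dx) =$$

$$= \lim_{n \to \infty} \int \Bigl(e^{i(z, x)} - 1 - \frac{i(z, x)}{1 + |x|^2} + \frac{1}{2} \frac{(z, x)^2}{1 + |x|^2}\Bigr) \frac{1 + |x|^2}{|x|^2} \, \mu_n(dx) =$$

$$= \int \Bigl(e^{i(z, x)} - 1 - \frac{i(z, x)}{1 + |x|^2} + \frac{1}{2} \frac{(z, x)^2}{1 + |x|^2}\Bigr) \frac{1 + |x|^2}{|x|^2} \, \pi'(dx),$$

since if we define the function

$$\Bigl(e^{i(z, x)} - 1 - \frac{i(z, x)}{1 + |x|^2} + \frac{1}{2} \frac{(z, x)^2}{1 + |x|^2}\Bigr) \frac{1 + |x|^2}{|x|^2}$$

to be zero at $x = 0$, this function will remain continuous. Hence

$$\lim_{n \to \infty} \ln E e^{i(z, \zeta_n)} = i(z, a) - \tfrac{1}{2}(Bz, z) + \int \Bigl(e^{i(z, x)} - 1 - \frac{i(z, x)}{1 + |x|^2}\Bigr) \frac{1 + |x|^2}{|x|^2} \, \pi(dx).$$

To prove the weak convergence of the measures $\nu_n$ it is sufficient to verify
that they are weakly compact. However,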

f II fen 

1- v„(dx) 



l_^^(«n,z) I fi„j,{dx)\ 

k=l 

kn 



z)l+ E 



k= 1 



1 - 



kn 

+.E, 



I gi(z,x-a„fc) < 

kn p 

< l(««, z)l + E (1 - cos (z, X - a„fc)) H„k (dx) + 
k=l J 

(z,x-a„t) 



J 



sin(z, x-a„fc)- 



l + |x-aJ^J 



M„i(dx) 



We utilize the bounds 



(a„, z)K<5 + -(a„, zf , 



(1 - cos (z, X - aj) /t„t (dx) ^ 



(x-a„^, zf n^(dx) + 2 



Hnk(dx), 



r \x-aj^ 

J l + |x-a„/ 



|:>c-ank| >C 



sin(z, x-a„t) n„k(dx) 



l+|x-aj^ 

\x-a„k\^ 

l + |x-aj^ 






^<5 I ^ fi^(dx)+^ I sin^(z, x-a„^) n„i,(dx)^ 






I 

! 

g I (z,x-a„^)^ fx^(dx)+^ I H„^(dx), 



^l„k(dx) + 



|:«-a„k|=^c fx-a„k|>c 

(z, X - a n„^ (dx) ^ (1+ c^) ( V„^z, z) . 

|:>c-flnk| 

These bounds yield the following inequality 



1_| v„(dx) ^(H-c2)^l+^^(F„z,z)+^{a„,zf + 



+ ^ (At„ (^) + 1) + ( 2 + - j ju„ ({x : |x| > c - sup |a„fc| }) . 
Since by a suitable choice of $\delta$ and $c$ the expression

(^) + ^ + ^2 + /i„ ({x : |x| > c - sup \aj }) 

becomes arbitrarily small, to prove the compactness of $\nu_n$ it is sufficient
to show that the series

$$\sum_{k=1}^{\infty} (S_n e_k, e_k) = \sum_{k=1}^{\infty} (V_n e_k, e_k) + \sum_{k=1}^{\infty} (a_n, e_k)^2,$$

where the operator $S_n$ is defined by the relation

$$(S_n z, z) = (V_n z, z) + (a_n, z)^2,$$

is uniformly convergent in $n$ in any orthonormal basis $\{e_k\}$. The uniform
convergence in $n$ of the series $\sum_{k=1}^{\infty} (V_n e_k, e_k)$ follows from condition 3) and
the uniform convergence of the series $\sum_{k=1}^{\infty} (a_n, e_k)^2$ follows from the fact
that in view of condition 2)

$$\lim_{n \to \infty} \sum_{k=1}^{\infty} [(a_n, e_k) - (a, e_k)]^2 = 0.$$

The sufficiency of the conditions of the theorem is thus proved.

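Necessity. Let $\xi'_{n1}, \ldots, \xi'_{nk_n}$ be random variables, taking on values in $\mathcal{X}$,
which are mutually independent and also independent of $\xi_{n1}, \ldots, \xi_{nk_n}$,
and let $\xi'_{nk}$ and $\xi_{nk}$ have the same distribution. Since

$$P\Bigl\{\Bigl|\sum_{k=1}^{k_n} (\xi_{nk} - \xi'_{nk})\Bigr| > 2c\Bigr\} \le 2 P\Bigl\{\Bigl|\sum_{k=1}^{k_n} \xi_{nk}\Bigr| > c\Bigr\}$$

and the variables $\xi_{nk} - \xi'_{nk}$ are symmetric, it follows from Lemma 3 that

$$P\Bigl\{\sup_{j \le k_n} \Bigl|\sum_{k=1}^{j} (\xi_{nk} - \xi'_{nk})\Bigr| > 2c\Bigr\} \le 4 P\Bigl\{\Bigl|\sum_{k=1}^{k_n} \xi_{nk}\Bigr| > c\Bigr\}.$$

Hence

$$P\{\sup_{k \le k_n} |\xi_{nk} - \xi'_{nk}| > 4c\} \le 4 P\{|\zeta_n| > c\}.$$

Choosing $c$ sufficiently large so that $4 P\{|\zeta_n| > c\} < 1$, we have, in
view of the inequality $x \le -\ln(1 - x)$,

$$\sum_{k=1}^{k_n} P\{|\xi_{nk} - \xi'_{nk}| > 4c\} \le -\sum_{k=1}^{k_n} \ln P\{|\xi_{nk} - \xi'_{nk}| \le 4c\} =$$

$$= -\ln P\{\sup_{k \le k_n} |\xi_{nk} - \xi'_{nk}| \le 4c\} \le -\ln(1 - 4 P\{|\zeta_n| > c\}).$$

Consequently, vectors $b_{nk}$ can be found such that

$$\sum_{k=1}^{k_n} P\{|\xi_{nk} - b_{nk}| > 4c\} \le -\ln(1 - 4 P\{|\zeta_n| > c\}).$$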

Since sup P{|c^„fc| >a} -^0 for every 8>0, we have |h„^|^c + e, provided 

k 

only 
are uniformly bounded in $n$ (provided $c$ is sufficiently large). But for all
$\delta > 0$

inf f \x-a\^ n„j{dx). 

|x| <c-5 

Denote by d^j the value of a for which the infimum is obtained. For n 
sufficiently large 

xn„j{dx). 

\x\<c — d 

Since 




infP{|<^y<^} = infP{|^„,.|<^}-l 

j j 

as n 00 , it follows that 

kn r 

sup ^ |x-dJ^/i„fc(dx)<oo. 

n k=l J 

|x| ^c-d 

From the last inequality and (13) we obtain 

^ TTl \2t^r,kidx)<co. 

n k=l J k-\-\X — a„k\ 

However 
Since 



\x-aj^ = \x- aj^ + 2(x - a„^, a„^ - a„^) + \a„^ - a„, 



^«/c 5 ^nk ^nfc) 



l + |x-a„, 



we have 



sup X! 

n k=l 



'' \x-aj^ + \a„^-aj^ 



It thus follows that 

sup 



“pE Jf 



l + |x-aj 
Ix-aJ^ 



2 ^ink(dx) = 0 , 

,-n ,|2 

2 t^nk(dx)<CO. 



+ \x-aj 
The following relation is obvious 

\x-a„f 



2t^nk{dx)<CO. 



sup 

k^k„ 



l + lx-a„, 



flnkidx) = 0(l). 



(15) 



(16) 
Therefore utilizing the inequality
i(z,x-aj 



I 



^i{z,x-ank) _ 



1 + 



^lnk{dx) 












fink{dx) + 



+ 












we obtain the following expansion of lnx„(z): 

In Xn ( 2 ) = In E e‘ <-. ?-) = ,• (^a„, z) + 



kn r / 

+ E U'- 

k=lj \ 






+ (F„z, z) [0(sup(K„^z, z)) + o(l)] + o(l). 

k^kn 



It follows from the compactness of the measures v„ that for each £>0 an 
operator Be and operators A„eSi can be found such that 1 — Rex„(^)^ 
^ 8 for (BA„Bz, z) ^ 1. Thus — In \Xn{^)\ < for £ sufficiently small provided 
{BA^Bz, 1. Hence 

^ I (1 -cos(z, x-a„k)) n„^{dx)<2e. 
k=l J 



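However in this case,

$$\int (1 - \cos(z, x)) \, \mu_n(dx) = \sum_{k=1}^{k_n} \int (1 - \cos(z, x - a_{nk})) \frac{|x - a_{nk}|^2}{1 + |x - a_{nk}|^2} \, \mu_{nk}(dx) \le$$

$$\le \sum_{k=1}^{k_n} \int (1 - \cos(z, x - a_{nk})) \, \mu_{nk}(dx) < 2\varepsilon.$$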
k=l J 



We have thus shown that the measures $\mu_n$ are compact. Hence in view of
Corollary 2 of Theorem 2 in Section 2, operators are compact in 
and therefore, in view of Lemma 2 in Section 2, the series 

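$\sum_{k=1}^{\infty} (V_n e_k, e_k)$ converges uniformly in each basis $\{e_k\}$. Clearly

$$\lim_{n \to \infty} \chi_n(z) = \lim_{n \to \infty} \exp\Bigl\{ i(a_n, z) - \tfrac{1}{2}(V_n z, z) +$$

$$+ \int \Bigl(e^{i(z, x)} - 1 - \frac{i(z, x)}{1 + |x|^2} + \frac{1}{2} \frac{(z, x)^2}{1 + |x|^2}\Bigr) \frac{1 + |x|^2}{|x|^2} \, \mu_n(dx) \Bigr\}.$$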
We choose a subsequence $n'$ such that the measures $\mu_{n'}$ converge to a
certain measure $\pi'$ and $(V_{n'} z, z) \to (Vz, z)$. In this case the limit of $(a_{n'}, z)$
(which is equal to $(a, z)$) exists. Thus

$$\lim_{n' \to \infty} \chi_{n'}(z) = \exp\Bigl\{ i(a, z) - \tfrac{1}{2}(Vz, z) +$$

$$+ \int \Bigl(e^{i(z, x)} - 1 - \frac{i(z, x)}{1 + |x|^2} + \frac{1}{2} \frac{(z, x)^2}{1 + |x|^2}\Bigr) \frac{1 + |x|^2}{|x|^2} \, \pi'(dx) \Bigr\}.$$

The weak convergence of $\mu_n$ to $\pi'$ and also the convergence of $(V_n z, z)$ to
$(Vz, z)$ follow from the uniqueness of the representation of a characteristic
functional. Moreover, this yields the weak convergence of $a_n$ to $a$. To
show that the strong convergence of $a_n$ to $a$ is also valid, we note that
$\chi_n(z)$ is uniformly convergent to $\chi(z)$ for $|z| \le c$ (cf. the remark for Theorem
3 in Section 2), $(V_n z, z) \to (Vz, z)$ also uniformly for all $|z| \le c$ (this follows
from condition 3 of the theorem) and finally one can show in exactly
the same manner as is done in the remark for Theorem 3 of Section 2 that
the series

$$\sum_{k=1}^{k_n} \int \Bigl[e^{i(z, x - a_{nk})} - 1 - \frac{i(z, x - a_{nk})}{1 + |x - a_{nk}|^2} + \frac{1}{2} \frac{(z, x - a_{nk})^2}{1 + |x - a_{nk}|^2}\Bigr] \mu_{nk}(dx)$$

converges uniformly to

$$\int \Bigl(e^{i(z, x)} - 1 - \frac{i(z, x)}{1 + |x|^2} + \frac{1}{2} \frac{(z, x)^2}{1 + |x|^2}\Bigr) \frac{1 + |x|^2}{|x|^2} \, \pi'(dx).$$

Therefore $(a_n, z)$ also converges uniformly to $(a, z)$ for $|z| \le c$. The same
argument as in Theorem 2 now yields that $a_n$ converges (strongly) to $a$.
The theorem is thus proved. □

Remark. If we define $a_{nk}$ by means of the relation

$$(a_{nk}, z) = \int_{|x| \le c} (x, z) \, \mu_{nk}(dx),$$

and the variables $\mu_n$, $V_{nk}$ and $V_n$ are defined as in Theorem 4, then under
the conditions of Theorem 4, the distribution of the variable $\zeta_n$ will
converge weakly to an infinitely divisible distribution with the characteristic
functional

$$\chi(z) = \exp\Bigl\{ i(a, z) - \tfrac{1}{2}(Bz, z) + \int_{|x| \le c} \bigl(e^{i(z, x)} - 1 - i(z, x)\bigr) \frac{1 + |x|^2}{|x|^2} \, \pi(dx) +$$

$$+ \int_{|x| > c} \bigl(e^{i(z, x)} - 1\bigr) \frac{1 + |x|^2}{|x|^2} \, \pi(dx) \Bigr\},$$

provided only $c$ is chosen in such a manner that $\pi(\{x : |x| = c\}) = 0$. The
proof of the necessity and sufficiency of the conditions is identical to the
proof given in Theorem 4.



§ 4. Limit Theorems for Continuous Random Processes 

In this section general theorems on weak convergence of measures in
metric spaces, presented in Section 1, are applied to the derivation of
limit theorems for random processes continuous with probability 1.

Let $\xi_n(t)$ be a sequence of random processes defined on the interval
$[a, b]$ taking on values in a certain separable complete metric space $\mathcal{X}$
and continuous on $[a, b]$ with probability 1. Denote by $\mathcal{C}_{[a,b]}(\mathcal{X})$ the set
of continuous functions $x(t)$ defined on $[a, b]$ and taking on values in
$\mathcal{X}$. We introduce the following metric on $\mathcal{C}_{[a,b]}(\mathcal{X})$: the distance between
$x(\cdot)$ and $y(\cdot)$ is

$$\sup_{a \le t \le b} \rho(x(t), y(t)),$$

where $\rho$ is the distance in $\mathcal{X}$. In this metric the space $\mathcal{C}_{[a,b]}(\mathcal{X})$ becomes a
complete separable metric space. Denote by $\mathfrak{B}_{[a,b]}(\mathcal{X})$ the $\sigma$-algebra of
all Borel sets in $\mathcal{C}_{[a,b]}(\mathcal{X})$. This $\sigma$-algebra coincides with the smallest
$\sigma$-algebra containing all cylindrical sets in $\mathcal{C}_{[a,b]}(\mathcal{X})$ (cf. Section 2 of
Chapter V; the proof given in that section is for the case of a linear $\mathcal{X}$, but
the proof in our case is the same). Therefore one can associate with each
process $\xi_n(t)$ a measure $\mu_n$ on $\mathfrak{B}_{[a,b]}(\mathcal{X})$ such that the values of this measure
on cylindrical sets coincide with the finite dimensional distributions of
the process $\xi_n(t)$.
What is the meaning of the weak convergence of measures $\mu_n$ for
random processes $\xi_n(t)$?

Let $\mu_n$ converge weakly to $\mu$ where $\mu$ is the measure associated with
the process $\xi(t)$. Then for each $\mu$-almost everywhere continuous and
bounded $\mathfrak{B}_{[a,b]}(\mathcal{X})$-measurable functional $\varphi(x)$ defined on $\mathcal{C}_{[a,b]}(\mathcal{X})$ we
have

$$\lim_{n \to \infty} \int \varphi(x) \, \mu_n(dx) = \int \varphi(x) \, \mu(dx)$$

(cf. lemma of Section 1). Therefore for each $\mu$-almost everywhere continuous
$\mathfrak{B}_{[a,b]}(\mathcal{X})$-measurable functional $f(x)$

$$\lim_{n \to \infty} \int e^{i\lambda f(x)} \, \mu_n(dx) = \int e^{i\lambda f(x)} \, \mu(dx)$$

for all real $\lambda$.

We now note that

$$\int e^{i\lambda f(x)} \, \mu_n(dx) = E e^{i\lambda f(\xi_n(\cdot))}, \qquad \int e^{i\lambda f(x)} \, \mu(dx) = E e^{i\lambda f(\xi(\cdot))};$$

here $f(\xi_n(\cdot))$ and $f(\xi(\cdot))$ are random variables for each measurable
functional $f$ and the last formulas are corollaries of formula (2)
in Section 1 of Chapter V. The convergence of the distribution of the
variable $f(\xi_n(\cdot))$ to the distribution of the variable $f(\xi(\cdot))$ follows from
the convergence of the characteristic function of the variable $f(\xi_n(\cdot))$ to
the characteristic function of $f(\xi(\cdot))$. Thus the weak convergence of
measures $\mu_n$ to $\mu$ yields the convergence of the distribution of the variables
$f(\xi_n(\cdot))$ to the distribution of $f(\xi(\cdot))$ for each $\mu$-almost everywhere
continuous $\mathfrak{B}_{[a,b]}(\mathcal{X})$-measurable functional $f(x)$. Conversely, if the
distribution of $f(\xi_n(\cdot))$ converges to the distribution of $f(\xi(\cdot))$ for each
$\mu$-almost everywhere continuous $\mathfrak{B}_{[a,b]}(\mathcal{X})$-measurable functional, then
$E\varphi(\xi_n(\cdot)) \to E\varphi(\xi(\cdot))$ for each bounded $\mu$-almost everywhere continuous
$\mathfrak{B}_{[a,b]}(\mathcal{X})$-measurable functional $\varphi$, i.e.

$$\int \varphi(x) \, \mu_n(dx) \to \int \varphi(x) \, \mu(dx).$$

Therefore the weak convergence of measures $\mu_n$ to $\mu$ is equivalent to the
convergence of the distributions of $f(\xi_n(\cdot))$ to the distribution of $f(\xi(\cdot))$
for each $\mu$-almost everywhere continuous $\mathfrak{B}_{[a,b]}(\mathcal{X})$-measurable functional $f$.

It is common when investigating limit theorems for stochastic processes
to assume the weak convergence of the marginal distributions,
i.e. the convergence of $\mu_n(A)$ to $\mu(A)$ for all cylindrical sets
$A$ which are sets of continuity for the measure $\mu$. For each open sphere
in $\mathscr{C}_{[a,b]}(\mathscr{X})$ of the form

$$\{x(\cdot):\ \rho(x(t),\bar{x}(t))<\varepsilon,\ a\le t\le b\},$$

where $\bar{x}(t)$ is a given function in $\mathscr{C}_{[a,b]}(\mathscr{X})$, the following relation is valid:

$$\{x(\cdot):\ \rho(x(t),\bar{x}(t))<\varepsilon,\ a\le t\le b\}=\bigcup_{m=1}^{\infty}\bigcap_{N=1}^{\infty}\Big\{x(\cdot):\ \rho(x(t_k),\bar{x}(t_k))<\varepsilon-\frac{1}{m},\ k=1,\dots,N\Big\},$$

where $\{t_1,t_2,\dots\}$ is an everywhere dense sequence on $[a,b]$. Consequently,
the algebra $\mathfrak{A}_0$ of open cylindrical sets which are sets of continuity
for the measure $\mu$ satisfies the conditions of Theorem 4 in Section 1.

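To apply this theorem one must find the general form of a compactum
in the space $\mathscr{C}_{[a,b]}(\mathscr{X})$. In the case when $\mathscr{X}$ is a finite-dimensional Euclidean
space, the general form of a compactum in $\mathscr{C}_{[a,b]}(\mathscr{X})$ is given by the
well-known Arzelà theorem.

An analogous result is also valid in our case. We state this result as
the following lemma.

Let $\lambda_\delta$ be a positive monotonic and continuous function defined for
$\delta>0$ satisfying the condition $\lambda_\delta\downarrow 0$ as $\delta\downarrow 0$, and let $X_1$ be a compactum in $\mathscr{X}$.
Denote by $K(X_1,\lambda_\delta)$ the set of functions $x(t)$ belonging to $\mathscr{C}_{[a,b]}(\mathscr{X})$
satisfying the conditions: a) $x(t)\in X_1$ for $a\le t\le b$; b) $\rho(x(t_1),x(t_2))\le\lambda_\delta$ for
$|t_1-t_2|\le\delta$.

Lemma 1. The set $K(X_1,\lambda_\delta)$ is compact in $\mathscr{C}_{[a,b]}(\mathscr{X})$. For each compactum
$K_1$ in $\mathscr{C}_{[a,b]}(\mathscr{X})$ there exist a compactum $X_1$ in $\mathscr{X}$ and a function $\lambda_\delta$ which
is positive, continuous and increasing and satisfies the condition $\lambda_{+0}=0$,
such that $K_1\subset K(X_1,\lambda_\delta)$.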

Proof. To prove the compactness of the set $K(X_1,\lambda_\delta)$ we consider an
arbitrary sequence $x_n(\cdot)$ belonging to this set and show that a convergent
subsequence can be extracted from this sequence. Utilizing the
compactness of the set of values of $x_n(t)$ for each $t$ we can select, using
the diagonal method, a subsequence $x_{n_k}(t)$ which converges to
a certain limit for all rational $t$ in $[a,b]$. Denote $y_k(t)=x_{n_k}(t)$ and show
that the sequence $y_k(t)$ is convergent. Let $a\le t_1<\dots<t_N\le b$ be rational
points such that the length of each of the intervals $[a,t_1],[t_1,t_2],\dots,[t_N,b]$
does not exceed $\delta$. Then

$$r(y_k(\cdot),y_l(\cdot))=\sup_{a\le t\le b}\rho(y_k(t),y_l(t))\le\sup_{1\le i\le N}\rho(y_k(t_i),y_l(t_i))+\sup\{\rho(y_k(t_i),y_k(t))+\rho(y_l(t_i),y_l(t));\ |t-t_i|\le\delta\}\le\sup_{1\le i\le N}\rho(y_k(t_i),y_l(t_i))+2\lambda_\delta.$$

Therefore

$$\varlimsup_{k,l\to\infty}r(y_k(\cdot),y_l(\cdot))\le 2\lambda_\delta.$$

Since $\delta>0$ is arbitrary, it follows that $y_k(\cdot)$ is a fundamental sequence
and thus is convergent to a limit. To prove the second assertion of the
lemma we denote by $X_t$ the set of values $x(t)$ with $x(\cdot)\in K_1$. We show
that $X_1=\bigcup_t X_t$, $t\in[a,b]$, is compact. Let $x_n\in X_1$. Then $x_n=y_n(t_n)$, where
$y_n(\cdot)\in K_1$. Choosing the subsequence $n_k$ in such a manner that $t_{n_k}\to t_0$
and $y_{n_k}(\cdot)\to y(\cdot)$, we observe that $x_{n_k}\to y(t_0)\in X_1$. Next, setting
$\lambda_\delta(x(\cdot))=\sup\{\rho(x(t_1),x(t_2));\ |t_1-t_2|\le\delta\}$, we can easily verify that
$\lambda_\delta(x(\cdot))$ is jointly continuous in the variables $\delta>0$ and $x(\cdot)$. Therefore
it follows from the compactness of $K_1$ that the function $\sup\{\lambda_\delta(x(\cdot));\ x(\cdot)\in K_1\}=\lambda_\delta$
is continuous in the variable $\delta$. The monotonicity of $\lambda_\delta$
follows from the relation $\lambda_{\delta_1}(x(\cdot))\le\lambda_{\delta_2}(x(\cdot))$ for $\delta_1<\delta_2$. Since $\lambda_\delta(x(\cdot))$
approaches zero monotonically as $\delta\downarrow 0$, it follows from Dini's theorem
that the convergence is uniform on each compactum. Therefore

$$\lim_{\delta\downarrow 0}\lambda_\delta=\lim_{\delta\downarrow 0}\sup\{\lambda_\delta(x(\cdot));\ x(\cdot)\in K_1\}=0.$$

Hence $K_1\subset K(X_1,\lambda_\delta)$ and the lemma is proved. □

Theorem 1. Let the marginal distributions of the processes $\xi_n(t)$ converge
to the marginal distributions of the process $\xi(t)$. In order that for all
functionals $f$ continuous on $\mathscr{C}_{[a,b]}(\mathscr{X})$ the distribution of $f(\xi_n(\cdot))$ converge to
the distribution of $f(\xi(\cdot))$ it is necessary and sufficient that for every $\varepsilon>0$
the relation

$$\lim_{h\to 0}\sup_n\mathsf{P}\Big\{\sup_{|t_1-t_2|\le h}\rho(\xi_n(t_1),\xi_n(t_2))>\varepsilon\Big\}=0 \qquad (1)$$

be satisfied.

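Proof. Necessity. If the assertion of the theorem is satisfied, then the
sequence of measures $\mu_n$ associated with the processes $\xi_n(t)$ is weakly
compact so that condition b) of Theorem 1 in Section 1 is satisfied. Therefore
for each $\eta>0$ a compactum $K(X_1,\lambda_\delta)$ exists such that

$$\sup_n\mu_n\big(\mathscr{C}_{[a,b]}(\mathscr{X})-K(X_1,\lambda_\delta)\big)\le\eta.$$

Then

$$\sup_n\mathsf{P}\Big\{\sup_{|t_1-t_2|\le h}\rho(\xi_n(t_1),\xi_n(t_2))>\lambda_h\Big\}\le\eta.$$

If $h$ is sufficiently small, then $\lambda_h<\varepsilon$ and

$$\varlimsup_{h\to 0}\sup_n\mathsf{P}\Big\{\sup_{|t_1-t_2|\le h}\rho(\xi_n(t_1),\xi_n(t_2))>\varepsilon\Big\}\le\eta.$$

Since $\eta>0$ is arbitrary, we obtain equation (1).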

Sufficiency. Taking into account the convergence of the measures $\mu_n$ to $\mu$ on
cylindrical open sets of continuity of the measure $\mu$ (which follows from
the convergence of the marginal distributions of $\xi_n(t)$ to the marginal
distributions of $\xi(t)$) and Theorem 4 of Section 1, we observe that it is
sufficient to establish the compactness of the measures $\mu_n$. We denote by
$\nu_n^t$ the measure on $\mathscr{X}$ which represents the distribution of $\xi_n(t)$ and show
that the set of measures $\{\nu_n^t;\ n=1,2,\dots,\ t\in[a,b]\}$ is compact. Indeed,
if $\nu_{n_k}^{t_k}$ is a sequence of measures we can, by choosing a subsequence
such that $t_k\to t_0$, easily verify that for each bounded and continuous
function $\varphi(x)$ defined on $\mathscr{X}$,

$$\lim_{k\to\infty}\int\varphi(x)\,\nu_{n_k}^{t_k}(dx)=\lim_{k\to\infty}\mathsf{E}\varphi(\xi_{n_k}(t_k))=\lim_{k\to\infty}\mathsf{E}\varphi(\xi_{n_k}(t_0))+\lim_{k\to\infty}\mathsf{E}[\varphi(\xi_{n_k}(t_k))-\varphi(\xi_{n_k}(t_0))]=\mathsf{E}\varphi(\xi(t_0)).$$

This is because for each compactum $X_1$ and $\delta>0$

$$\varlimsup_{k\to\infty}\mathsf{E}|\varphi(\xi_{n_k}(t_k))-\varphi(\xi_{n_k}(t_0))|\le 2\sup_x|\varphi(x)|\ \varlimsup_{k\to\infty}\big[\mathsf{P}\{\xi_{n_k}(t_0)\notin X_1\}+\mathsf{P}\{\rho(\xi_{n_k}(t_k),\xi_{n_k}(t_0))>\delta\}\big]+\sup\{|\varphi(x)-\varphi(y)|;\ x\in X_1,\ \rho(x,y)\le\delta\},$$

and the r.h.s. may become arbitrarily small in view of the continuity of
$\varphi(x)$, the compactness of the family of distributions of $\xi_n(t_0)$ and condition (1).

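We now choose the sequence $h_k$ according to the following condition:

$$\sup_n\mathsf{P}\Big\{\sup_{|t'-t''|\le h_k}\rho(\xi_n(t'),\xi_n(t''))>2^{-k}\Big\}<2^{-k}.$$

Let $X^{(k)}$ be a compactum such that $\mathsf{P}\{\xi_n(t)\notin X^{(k)}\}\le 2^{-k}\,\dfrac{h_k}{b-a}$ for all $n$,
$t\in[a,b]$. Denote by $X_1^{(k)}$ the set of $x$ such that $\rho(x,X^{(k)})\le 2^{-k}$. Then

$$\mathsf{P}\{\xi_n(t)\in X_1^{(k)},\ a\le t\le b\}\ge\mathsf{P}\Big\{\xi_n(a+lh_k)\in X^{(k)},\ lh_k\le b-a;\ \sup_{|t_1-t_2|\le h_k}\rho(\xi_n(t_1),\xi_n(t_2))\le 2^{-k}\Big\}.$$

Therefore

$$1-\mathsf{P}\{\xi_n(t)\in X_1^{(k)},\ a\le t\le b\}\le\sum_{lh_k\le b-a}\mathsf{P}\{\xi_n(a+lh_k)\notin X^{(k)}\}+\mathsf{P}\Big\{\sup_{|t_1-t_2|\le h_k}\rho(\xi_n(t_1),\xi_n(t_2))>2^{-k}\Big\}\le 2\cdot 2^{-k}.$$

Note that $\bigcap_{k=m}^{\infty}X_1^{(k)}$ is a compactum in $\mathscr{X}$. We now construct for each $\varepsilon>0$
a compactum $K(X_1,\lambda_\delta)$ such that $\mathsf{P}\{\xi_n(\cdot)\notin K(X_1,\lambda_\delta)\}\le\varepsilon$ for all $n$.
To do this we choose an $m$ such that $2\sum_{k\ge m}2^{-k}<\dfrac{\varepsilon}{2}$ and set $X_1=\bigcap_{k=m}^{\infty}X_1^{(k)}$.
Consider a sequence $\lambda_r\downarrow 0$. For each $r$ an $h_r$ can be found such that
$h_r<h_{r-1}$ and

$$\sup_n\mathsf{P}\Big\{\sup_{|t_1-t_2|\le h_r}\rho(\xi_n(t_1),\xi_n(t_2))>\lambda_r\Big\}<\frac{\varepsilon}{2^{r+1}}.$$

Let $\lambda_\delta$ be a nonnegative continuous non-decreasing function such that
$\lambda_{h_{r+1}}=\lambda_r$. Clearly, $\lambda_\delta\downarrow 0$ as $\delta\downarrow 0$. Moreover,

$$\mathsf{P}\{\xi_n(\cdot)\notin K(X_1,\lambda_\delta)\}\le 1-\mathsf{P}\{\xi_n(t)\in X_1,\ a\le t\le b\}+\sum_{r=1}^{\infty}\mathsf{P}\Big\{\sup_{|t_1-t_2|\le h_r}\rho(\xi_n(t_1),\xi_n(t_2))>\lambda_r\Big\}<\frac{\varepsilon}{2}+\sum_{r=1}^{\infty}\frac{\varepsilon}{2^{r+1}}=\varepsilon.$$

The theorem is thus proved. □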

Remark 1. Condition (1) may be replaced by the following condition:

$$\lim_{h\to 0}\varlimsup_{n\to\infty}\mathsf{P}\Big\{\sup_{|t_1-t_2|\le h}\rho(\xi_n(t_1),\xi_n(t_2))>\varepsilon\Big\}=0. \qquad (2)$$

(The latter is often easier to verify.) Indeed it follows from (2) that for any
$\eta>0$, a $\delta>0$ and an $N$ exist such that for $n>N$ and for $h<\delta$ we have

$$\mathsf{P}\Big\{\sup_{|t_1-t_2|\le h}\rho(\xi_n(t_1),\xi_n(t_2))>\varepsilon\Big\}\le\eta. \qquad (3)$$

The uniform continuity of the processes $\xi_n(t)$ follows from their continuity.
We thus have for each $n$

$$\lim_{h\to 0}\mathsf{P}\Big\{\sup_{|t_1-t_2|\le h}\rho(\xi_n(t_1),\xi_n(t_2))>\varepsilon\Big\}=0.$$

Therefore a $\delta$ can be chosen such that for $h<\delta$ relation (3) is satisfied for
all $n$.

The following theorem is occasionally more convenient in applications.

Theorem 2. Let the marginal distributions of the processes $\xi_n(t)$ converge
to the finite-dimensional distributions of the process $\xi(t)$ and let $\alpha>0$,
$\beta>0$ and $H>0$ exist such that for all $t_1,t_2\in[a,b]$ and for all $n$

$$\mathsf{E}\,\rho^{\alpha}(\xi_n(t_1),\xi_n(t_2))\le H|t_1-t_2|^{1+\beta}. \qquad (4)$$

Then for all functionals $f$ continuous on $\mathscr{C}_{[a,b]}(\mathscr{X})$ the distribution of $f(\xi_n(\cdot))$
converges to the distribution of $f(\xi(\cdot))$.
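As an illustration of a moment condition of the type used in Theorem 2, that is a bound of the form $\mathsf{E}\,\rho^{\alpha}(\xi_n(t_1),\xi_n(t_2))\le H|t_1-t_2|^{1+\beta}$, the following sketch checks it exactly for one special case. The assumptions are ours and not part of the text: real-valued processes with $\rho(x,y)=|x-y|$, steps equal to $\pm1$ with probability $1/2$, the normalized walk $\xi_n(k/n)=S_k/\sqrt{n}$, the constants $\alpha=4$, $\beta=1$, $H=3$, and only grid points $t=k/n$ are examined. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def fourth_moment_of_walk(m):
    """Exact E[S_m^4] for S_m = sum of m independent +/-1 steps (assumed step law)."""
    # S_m = 2k - m when exactly k of the m steps equal +1.
    total = sum(comb(m, k) * (2 * k - m) ** 4 for k in range(m + 1))
    return Fraction(total, 2 ** m)


def moment_bound_holds(n, H=3, beta=1):
    """Check E|xi_n(j/n) - xi_n(i/n)|^4 <= H |j/n - i/n|^(1+beta) on the grid.

    Increments of the walk are stationary, so only m = j - i matters.
    """
    for m in range(1, n + 1):
        lhs = fourth_moment_of_walk(m) / n ** 2      # E (S_m / sqrt(n))^4
        rhs = H * Fraction(m, n) ** (1 + beta)
        if lhs > rhs:
            return False
    return True
```

Exact rational arithmetic is used so that the comparison is not affected by rounding; the check says nothing about non-grid points or other step distributions.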

Proof. We utilize Lemma 1 of Section 5 in Chapter III. Condition (4)
of this lemma is fulfilled for the process $\xi_n(t)$ if we set $g(h)=h^{\gamma}$, where
$0<\gamma<\dfrac{\beta}{\alpha}$, and $q(C,h)=HC^{-\alpha}h^{1+\delta}$, where $\delta=\beta-\alpha\gamma$. Here the functions $G(m)$
and $Q(m,C)$ defined by equation (8) of Section 5 in Chapter III are given
by

$$G(m)=\frac{T^{\gamma}2^{-m\gamma}}{1-2^{-\gamma}},\qquad Q(m,C)=\frac{HC^{-\alpha}T^{1+\delta}2^{-m\delta}}{1-2^{-\delta}},\qquad T=b-a.$$

Hence in view of relation (7) of Section 5 in Chapter III the following
inequality

$$\mathsf{P}\Big\{\sup_{|t_1-t_2|\le h}\rho(\xi_n(t_1),\xi_n(t_2))>\varepsilon\Big\}\le L\varepsilon^{-\alpha}h^{\beta}$$

is satisfied, where $L$ is a constant. The remainder of the proof follows
from Theorem 1. □

Convergence of processes constructed from the sums of independent random
variables. Let $\xi_{n1},\xi_{n2},\dots,\xi_{nk_n}$ be a double sequence (a sequence of
series) of numerical random variables, independent in each series and
satisfying the conditions

1) $\mathsf{E}\xi_{ni}=0$, $i=1,\dots,k_n$;

2) $\operatorname{Var}\xi_{ni}=b_{ni}$, $\sum_{i=1}^{k_n}b_{ni}=1$.

We construct the random function $\xi_n(t)$, $t\in[0,1]$, as follows: set

$$S_{nk}=\sum_{i=1}^{k}\xi_{ni},\qquad t_{nk}=\sum_{i=1}^{k}b_{ni},$$

$$\xi_n(t)=S_{nk}+\frac{t-t_{nk}}{t_{n,k+1}-t_{nk}}\,\xi_{n,k+1}\quad\text{for } t\in[t_{nk},t_{n,k+1}],$$

with $S_{n0}=0$, $t_{n0}=0$. Then $\xi_n(t)$ is a random broken line joining
the points of the plane $(t;\xi)$ with coordinates $(t_{nk};S_{nk})$, $k=0,1,\dots,k_n$.

We study the conditions under which the marginal distributions of the
processes $\xi_n(t)$ and the distributions of functionals of these processes
converge to the marginal distributions and to the distributions of the
corresponding functionals of the Brownian motion process $w(t)$.

Theorem 3. Let the random variables $\xi_{ni}$ satisfy conditions 1) and 2) as well
as the Lindeberg condition: if $F_{ni}(x)$ is the distribution function of the
variable $\xi_{ni}$, then for each $\varepsilon>0$

$$\lim_{n\to\infty}\sum_{i=1}^{k_n}\int_{|u|>\varepsilon}u^2\,dF_{ni}(u)=0. \qquad (5)$$

Under these conditions the finite-dimensional distributions of the processes
$\xi_n(t)$ converge to the finite-dimensional distributions of the process $w(t)$ and
the distribution of $f(\xi_n(\cdot))$ converges to the distribution of $f(w(\cdot))$ for each
continuous functional $f$ on $\mathscr{C}_{[0,1]}$.

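Proof. The convergence of the finite-dimensional distributions of the processes
$\xi_n(t)$ to the finite-dimensional distributions of $w(t)$ follows from the central
limit theorem. To prove the convergence of the distributions of $f(\xi_n(\cdot))$ to
the distributions of $f(w(\cdot))$ for all functionals $f$ continuous on $\mathscr{C}_{[0,1]}$ we
verify that for an arbitrary $\varepsilon>0$ the condition

$$\lim_{h\to 0}\varlimsup_{n\to\infty}\mathsf{P}\Big\{\sup_{|t_1-t_2|\le h}|\xi_n(t_1)-\xi_n(t_2)|>\varepsilon\Big\}=0 \qquad (6)$$

is fulfilled and utilize Remark 1 following Theorem 1. Since

$$\sup_{|t_1-t_2|\le h}|\xi_n(t_1)-\xi_n(t_2)|\le 2\sup_k\sup_{kh<t\le(k+2)h}|\xi_n(t)-\xi_n(kh)|\le 4\sup_k\sup_{kh<t\le(k+1)h}|\xi_n(t)-\xi_n(kh)|,$$

it follows that

$$\mathsf{P}\Big\{\sup_{|t_1-t_2|\le h}|\xi_n(t_1)-\xi_n(t_2)|>\varepsilon\Big\}\le\sum_{kh<1}\mathsf{P}\Big\{\sup_{kh<t\le(k+1)h}|\xi_n(t)-\xi_n(kh)|>\frac{\varepsilon}{4}\Big\}.$$

Note that

$$\sup_{kh<t\le(k+1)h}|\xi_n(t)-\xi_n(kh)|\le 2\sup_{j_{n,k}<r\le j_{n,k+1}+1}\Big|\sum_{i=j_{n,k}+1}^{r}\xi_{ni}\Big|,$$

where $j_{n,k}$ is the maximal of the indices $j$ such that $t_{nj}$ does not exceed $kh$.
Since for $j_{n,k}<s\le j_{n,k+1}+1$

$$\varlimsup_{n\to\infty}\sup_s\mathsf{P}\Big\{\Big|\sum_{i=s}^{j_{n,k+1}+1}\xi_{ni}\Big|>\frac{\varepsilon}{16}\Big\}\le\frac{256}{\varepsilon^2}\,h,$$

it follows that for $h$ sufficiently small we have, in view of Theorem 6 of
Section 3 in Chapter II, that

$$\varlimsup_{n\to\infty}\mathsf{P}\Big\{\sup_{kh<t\le(k+1)h}|\xi_n(t)-\xi_n(kh)|>\frac{\varepsilon}{4}\Big\}\le\frac{1}{1-\dfrac{256}{\varepsilon^2}h}\ \varlimsup_{n\to\infty}\mathsf{P}\Big\{\Big|\sum_{i=j_{n,k}+1}^{j_{n,k+1}+1}\xi_{ni}\Big|>\frac{\varepsilon}{16}\Big\}.$$

It follows from the convergence of the finite-dimensional distributions of
$\xi_n(t)$ to the finite-dimensional distributions of $w(t)$ that

$$\lim_{n\to\infty}\mathsf{P}\Big\{\Big|\sum_{i=j_{n,k}+1}^{j_{n,k+1}+1}\xi_{ni}\Big|>\frac{\varepsilon}{16}\Big\}=\frac{1}{\sqrt{2\pi}}\int_{|u|>\varepsilon/(16\sqrt{h})}e^{-u^2/2}\,du.$$

Consequently,

$$\varlimsup_{n\to\infty}\mathsf{P}\Big\{\sup_{|t_1-t_2|\le h}|\xi_n(t_1)-\xi_n(t_2)|>\varepsilon\Big\}=O\Big(\frac{1}{h}\int_{|u|>\varepsilon/(16\sqrt{h})}e^{-u^2/2}\,du\Big).$$

Since

$$\lim_{h\to 0}\frac{1}{h}\int_{|u|>\varepsilon/(16\sqrt{h})}e^{-u^2/2}\,du=0,$$

we thus obtain condition (6). The theorem is thus proved. □

The following result is a corollary of Theorem 3.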



Theorem 4. Let $\xi_1,\xi_2,\dots,\xi_n,\dots$ be a sequence of independent identically
distributed random variables such that $\mathsf{E}\xi_k=0$, $\operatorname{Var}\xi_k=1$. Denote by $\xi_n(t)$
the random broken line with the vertices $\Big(\dfrac{k}{n},\dfrac{S_k}{\sqrt{n}}\Big)$, where $S_0=0$,
$S_k=\xi_1+\dots+\xi_k$. Then for each functional $f$ which is $\mu_w$-almost everywhere
defined and continuous on $\mathscr{C}_{[0,1]}$, where $\mu_w$ is the measure associated with the
process $w(t)$, the distribution of $f(\xi_n(\cdot))$ will converge to the distribution
of $f(w(\cdot))$.

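Corollary. If the conditions of Theorem 4 are satisfied, then

$$\lim_{n\to\infty}\mathsf{P}\Big\{\max_{1\le k\le n}|S_k|<\alpha\sqrt{n}\Big\}=\mathsf{P}\Big\{\sup_{0\le t\le 1}|w(t)|<\alpha\Big\}$$

for almost all $\alpha$.

This follows from the continuity of the functional $f(x(\cdot))=\sup_{0\le t\le 1}|x(t)|$.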
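The corollary can be illustrated numerically. The sketch below is a hedged example under our own assumptions: the steps are $\pm1$ with probability $1/2$, the sample sizes and seed are arbitrary, and the right-hand side is evaluated through the classical series $\mathsf{P}\{\sup_{0\le t\le 1}|w(t)|<a\}=\frac{4}{\pi}\sum_{k\ge 0}\frac{(-1)^k}{2k+1}\exp\{-(2k+1)^2\pi^2/(8a^2)\}$, which is a standard formula that is not derived in this section. The function names are hypothetical.

```python
import math
import random


def sup_abs_brownian_cdf(a, terms=50):
    """P{ sup_{0<=t<=1} |w(t)| < a } via the classical alternating series."""
    s = 0.0
    for k in range(terms):
        odd = 2 * k + 1
        s += (-1) ** k / odd * math.exp(-(odd ** 2) * math.pi ** 2 / (8 * a * a))
    return 4.0 / math.pi * s


def walk_max_probability(a, n=400, trials=2000, seed=0):
    """Monte Carlo estimate of P{ max_{k<=n} |S_k| < a*sqrt(n) } for +/-1 steps."""
    rng = random.Random(seed)
    bound = a * math.sqrt(n)
    hits = 0
    for _ in range(trials):
        s, inside = 0, True
        for _ in range(n):
            s += 1 if rng.random() < 0.5 else -1
            if abs(s) >= bound:      # the walk has left the strip (-bound, bound)
                inside = False
                break
        hits += inside
    return hits / trials
```

For moderate `n` the two numbers should agree up to Monte Carlo error and a discretization error that shrinks as `n` grows; the simulation is only a plausibility check, not a proof of the limit relation.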

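Theorem 5. Let the function $\varphi(x)$ be defined for $x\in(-\infty,\infty)$ and be Riemann
integrable on each finite segment and let the variables $\xi_k$ satisfy the conditions
of Theorem 4. Then

$$\lim_{n\to\infty}\mathsf{P}\Big\{\frac{1}{n}\sum_{k=1}^{n}\varphi\Big(\frac{S_k}{\sqrt{n}}\Big)<\alpha\Big\}=\mathsf{P}\Big\{\int_0^1\varphi(w(t))\,dt<\alpha\Big\}$$

for all $\alpha$ such that

$$\mathsf{P}\Big\{\int_0^1\varphi(w(t))\,dt=\alpha\Big\}=0.$$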
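Proof. We show that the functional

$$f(x(\cdot))=\int_0^1\varphi(x(t))\,dt$$

is $\mu_w$-almost everywhere continuous (in the metric of $\mathscr{C}_{[0,1]}$). Let $x_n(t)\to x(t)$
uniformly on $[0,1]$. Then $\varphi(x_n(t))\to\varphi(x(t))$ for all $t$ such that
$x(t)\notin\Lambda_\varphi$, where $\Lambda_\varphi$ is the set of discontinuities of the function $\varphi$. Denote
by $\chi_\varphi(x)$ the indicator of the set $\Lambda_\varphi$. Then the functional $f(x(\cdot))$ will be
continuous at the point $x(\cdot)\in\mathscr{C}_{[0,1]}$ if $x(t)\notin\Lambda_\varphi$ for almost all $t$, i.e. if

$$\int_0^1\chi_\varphi(x(s))\,ds=0,$$

since in this case $\varphi(x_n(t))\to\varphi(x(t))$ for almost all $t$ and the $\varphi(x_n(t))$ are
bounded by the same constant, because $\sup_{n,t}|x_n(t)|$ is finite and $\varphi(x)$ is
bounded on each finite interval. Since $\varphi$ is Riemann integrable, $\Lambda_\varphi$ has
Lebesgue measure 0. We observe that

$$\mathsf{E}\int_0^1\chi_\varphi(w(t))\,dt=\int_0^1\frac{1}{\sqrt{2\pi t}}\int_{\Lambda_\varphi}e^{-x^2/2t}\,dx\,dt=0.$$

The quantity $\int_0^1\chi_\varphi(w(t))\,dt$ is nonnegative, therefore

$$\mathsf{P}\Big\{\int_0^1\chi_\varphi(w(t))\,dt\ne 0\Big\}=0.$$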

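If we denote by $D_f$ the set of points of discontinuity of the
functional $f$, it follows that

$$D_f\subset\Big\{x(\cdot):\ \int_0^1\chi_\varphi(x(s))\,ds>0\Big\}$$

and hence

$$\mu_w(D_f)\le\mathsf{P}\Big\{\int_0^1\chi_\varphi(w(t))\,dt\ne 0\Big\}=0.$$

If $\xi_n(t)$ is the process introduced in Theorem 4, then, in view of Theorem 4,

$$\lim_{n\to\infty}\mathsf{P}\Big\{\int_0^1\varphi(\xi_n(t))\,dt<\alpha\Big\}=\mathsf{P}\Big\{\int_0^1\varphi(w(t))\,dt<\alpha\Big\},$$

provided only that

$$\mathsf{P}\Big\{\int_0^1\varphi(w(t))\,dt=\alpha\Big\}=0.$$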

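Let $\varphi_\varepsilon^-(x)$ and $\varphi_\varepsilon^+(x)$ be two continuous functions satisfying the relations
$\varphi_\varepsilon^-(x)\le\varphi(x)\le\varphi_\varepsilon^+(x)$ and

$$\int[\varphi_\varepsilon^+(x)-\varphi_\varepsilon^-(x)]\,dx<\varepsilon.$$

For any continuous function $\psi(x)$

$$\Big|\frac{1}{n}\sum_{k=1}^{n}\psi\Big(\frac{S_k}{\sqrt{n}}\Big)-\int_0^1\psi(\xi_n(t))\,dt\Big|\le\sum_{k=1}^{n}\int_{(k-1)/n}^{k/n}\Big|\psi\Big(\frac{S_k}{\sqrt{n}}\Big)-\psi(\xi_n(t))\Big|\,dt\le\sup\{|\psi(x)-\psi(y)|;\ |x-y|\le\delta_n,\ |x|\le C_n\},$$

where

$$\delta_n=\frac{1}{\sqrt{n}}\sup_{k\le n}|\xi_k|,\qquad C_n=\frac{1}{\sqrt{n}}\sup_{k\le n}|S_k|.$$

Therefore

$$\mathsf{P}\Big\{\Big|\frac{1}{n}\sum_{k=1}^{n}\psi\Big(\frac{S_k}{\sqrt{n}}\Big)-\int_0^1\psi(\xi_n(t))\,dt\Big|>\varepsilon\Big\}\le\mathsf{P}\{\delta_n>\delta\}+\mathsf{P}\{C_n>c\},$$

provided only that $\delta$ and $c$ are chosen such that for $|x-y|\le\delta$, $|x|\le c$ we
have $|\psi(x)-\psi(y)|<\varepsilon$. However $\delta_n\to 0$ in probability and the probability
$\mathsf{P}\{C_n>c\}$ becomes as small as desired for all $n$ by choosing $c$ sufficiently
large. Hence

$$\frac{1}{n}\sum_{k=1}^{n}\psi\Big(\frac{S_k}{\sqrt{n}}\Big)-\int_0^1\psi(\xi_n(t))\,dt\to 0$$

in probability so that

$$\lim_{n\to\infty}\mathsf{P}\Big\{\frac{1}{n}\sum_{k=1}^{n}\psi\Big(\frac{S_k}{\sqrt{n}}\Big)<\alpha\Big\}=\mathsf{P}\Big\{\int_0^1\psi(w(t))\,dt<\alpha\Big\},$$

provided only that

$$\mathsf{P}\Big\{\int_0^1\psi(w(t))\,dt=\alpha\Big\}=0.$$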



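Since

$$\mathsf{P}\Big\{\frac{1}{n}\sum_{k=1}^{n}\varphi_\varepsilon^+\Big(\frac{S_k}{\sqrt{n}}\Big)<\alpha\Big\}\le\mathsf{P}\Big\{\frac{1}{n}\sum_{k=1}^{n}\varphi\Big(\frac{S_k}{\sqrt{n}}\Big)<\alpha\Big\}\le\mathsf{P}\Big\{\frac{1}{n}\sum_{k=1}^{n}\varphi_\varepsilon^-\Big(\frac{S_k}{\sqrt{n}}\Big)<\alpha\Big\},$$

then approaching the limit as $n\to\infty$ in this relation, we obtain for each
$h>0$

$$\mathsf{P}\Big\{\int_0^1\varphi_\varepsilon^+(w(t))\,dt<\alpha-h\Big\}\le\varliminf_{n\to\infty}\mathsf{P}\Big\{\frac{1}{n}\sum_{k=1}^{n}\varphi\Big(\frac{S_k}{\sqrt{n}}\Big)<\alpha\Big\}\le\varlimsup_{n\to\infty}\mathsf{P}\Big\{\frac{1}{n}\sum_{k=1}^{n}\varphi\Big(\frac{S_k}{\sqrt{n}}\Big)<\alpha\Big\}\le\mathsf{P}\Big\{\int_0^1\varphi_\varepsilon^-(w(t))\,dt<\alpha+h\Big\}.$$

However,

$$\mathsf{E}\Big|\int_0^1\varphi_\varepsilon^+(w(t))\,dt-\int_0^1\varphi(w(t))\,dt\Big|\le\mathsf{E}\Big[\int_0^1\varphi_\varepsilon^+(w(t))\,dt-\int_0^1\varphi_\varepsilon^-(w(t))\,dt\Big]=\int_0^1\frac{1}{\sqrt{2\pi t}}\int[\varphi_\varepsilon^+(x)-\varphi_\varepsilon^-(x)]\,e^{-x^2/2t}\,dx\,dt\le\sqrt{\frac{2}{\pi}}\,\varepsilon.$$

Thus the distribution of $\int_0^1\varphi_\varepsilon^+(w(t))\,dt$ converges to the distribution of
$\int_0^1\varphi(w(t))\,dt$ as $\varepsilon\to 0$. An analogous assertion is valid also for $\varphi_\varepsilon^-$.
Approaching the limit as $\varepsilon\to 0$ we observe that for all $h>0$

$$\mathsf{P}\Big\{\int_0^1\varphi(w(t))\,dt<\alpha-h\Big\}\le\varliminf_{n\to\infty}\mathsf{P}\Big\{\frac{1}{n}\sum_{k=1}^{n}\varphi\Big(\frac{S_k}{\sqrt{n}}\Big)<\alpha\Big\}\le\varlimsup_{n\to\infty}\mathsf{P}\Big\{\frac{1}{n}\sum_{k=1}^{n}\varphi\Big(\frac{S_k}{\sqrt{n}}\Big)<\alpha\Big\}\le\mathsf{P}\Big\{\int_0^1\varphi(w(t))\,dt<\alpha+h\Big\}.$$

Approaching the limit as $h\to 0$ and taking into account the fact that the
function $\mathsf{P}\{\int_0^1\varphi(w(t))\,dt<z\}$ is continuous at $z=\alpha$ provided only that
$\mathsf{P}\{\int_0^1\varphi(w(t))\,dt=\alpha\}=0$, we obtain the proof of the theorem. □

Convergence of continuous processes with independent increments. Consider
continuous processes with independent increments and values in a
certain Banach space $\mathscr{X}$. If $\xi(t)$, $a\le t\le b$, is such a process, then for all
$\varepsilon>0$

$$\lim_{\lambda\to 0}\sum_{k=0}^{n-1}\mathsf{P}\{|\xi(t_{k+1})-\xi(t_k)|>\varepsilon\}=0, \qquad (7)$$

where $a=t_0<t_1<\dots<t_n=b$, $\lambda=\max_k(t_{k+1}-t_k)$ (cf. Theorems 1 and 4 in
Section 5 of Chapter III).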

Theorem 6. Let $\xi_n(t)$, $n=0,1,\dots$, be a sequence of continuous processes
with independent increments defined on $[a,b]$ and taking on values in $\mathscr{X}$.
In order that for each function $\varphi(x)$ continuous on $\mathscr{C}_{[a,b]}(\mathscr{X})$ the distribution
of the variable $\varphi(\xi_n(\cdot))$ converge to the distribution of the variable $\varphi(\xi_0(\cdot))$
it is necessary and sufficient that the following conditions be satisfied:

1) the marginal distributions of the processes $\xi_n(t)$ converge to the marginal
distributions of $\xi_0(t)$;

2) for each $\varepsilon>0$

$$\lim_{h\to 0}\varlimsup_{n\to\infty}\sup_{|t_1-t_2|\le h}\mathsf{P}\{|\xi_n(t_2)-\xi_n(t_1)|>\varepsilon\}=0.$$

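Proof. The necessity of condition 1) follows from the convergence of the
distribution of $g(\xi_n(t_1),\dots,\xi_n(t_k))$ to the distribution of $g(\xi_0(t_1),\dots,\xi_0(t_k))$
for each bounded continuous function $g(x_1,\dots,x_k)$ defined on $\mathscr{X}^k$ (the
functional $\varphi(x(\cdot))=g(x(t_1),\dots,x(t_k))$ is continuous on $\mathscr{C}_{[a,b]}(\mathscr{X})$). The
necessity of condition 2) follows from the Remark following Theorem 1
since the following inequality is valid:

$$\sup_{|t_1-t_2|\le h}\mathsf{P}\{|\xi_n(t_1)-\xi_n(t_2)|>\varepsilon\}\le\mathsf{P}\Big\{\sup_{|t_1-t_2|\le h}|\xi_n(t_2)-\xi_n(t_1)|>\varepsilon\Big\}.$$

In view of the Remark following Theorem 1, in order to prove the sufficiency
of the conditions of the theorem it is only necessary to show that
condition 2) yields for all $\varepsilon>0$ the equality

$$\lim_{h\to 0}\varlimsup_{n\to\infty}\mathsf{P}\Big\{\sup_{|t_1-t_2|\le h}|\xi_n(t_1)-\xi_n(t_2)|>\varepsilon\Big\}=0. \qquad (8)$$

In the same manner as in the Remark following Theorem 1, we show
that condition 2) yields the equality

$$\lim_{h\to 0}\sup_n\sup_{|t_1-t_2|\le h}\mathsf{P}\{|\xi_n(t_1)-\xi_n(t_2)|>\varepsilon\}=0$$

for each $\varepsilon>0$. Choose for a given $\varepsilon>0$ an $h$ so small that

$$\sup_n\sup_{|t_1-t_2|\le 2h}\mathsf{P}\Big\{|\xi_n(t_1)-\xi_n(t_2)|>\frac{\varepsilon}{4}\Big\}\le\frac{1}{2}. \qquad (9)$$

Then utilizing the continuity of $\xi_n(t)$ and Lemma 4 in Section 3 we obtain
that

$$\mathsf{P}\Big\{\sup_{s\le t\le s+2h}|\xi_n(t)-\xi_n(s)|>\frac{\varepsilon}{2}\Big\}\le 2\,\mathsf{P}\Big\{|\xi_n(s+2h)-\xi_n(s)|>\frac{\varepsilon}{4}\Big\}.$$

Therefore

$$\mathsf{P}\Big\{\sup_{|t_1-t_2|\le h}|\xi_n(t_1)-\xi_n(t_2)|>\varepsilon\Big\}\le\mathsf{P}\Big\{\sup\Big[|\xi_n(t)-\xi_n(a+kh)|;\ kh\le t-a\le(k+2)h,\ 0\le k<\frac{b-a}{h}\Big]>\frac{\varepsilon}{2}\Big\}$$

$$\le\sum_{kh<b-a}\mathsf{P}\Big\{\sup\big[|\xi_n(t)-\xi_n(a+kh)|;\ kh<t-a\le(k+2)h\big]>\frac{\varepsilon}{2}\Big\}\le 2\sum_{kh<b-a}\mathsf{P}\Big\{|\xi_n(a+kh+2h)-\xi_n(a+kh)|>\frac{\varepsilon}{4}\Big\}$$

(if $t>b$ we set $\xi_n(t)=\xi_n(b)$). In view of condition 1) of the theorem

$$\varlimsup_{n\to\infty}\mathsf{P}\Big\{\sup_{|t_1-t_2|\le h}|\xi_n(t_1)-\xi_n(t_2)|>\varepsilon\Big\}\le 2\sum_{kh<b-a}\mathsf{P}\Big\{|\xi_0(a+(k+2)h)-\xi_0(a+kh)|\ge\frac{\varepsilon}{4}\Big\}\le 4\sum_{kh<b-a}\mathsf{P}\Big\{|\xi_0(a+(k+1)h)-\xi_0(a+kh)|\ge\frac{\varepsilon}{8}\Big\}.$$

The fact that the last sum tends to zero as $h\to 0$ follows from condition (7).
The theorem is thus proved. □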

Convergence of continuous Markov processes. Consider a sequence of
continuous Markov processes $\xi_n(t)$, $n=0,1,\dots$, defined on the interval
$[a,b]$ and taking on values in a complete metric space $(\mathscr{X},\rho)$. Denote
by $P_n(t,x,s,A)$ the transition probability for the process $\xi_n(t)$. Let

$$V_\varepsilon(x)=\{y:\ \rho(x,y)>\varepsilon\},$$

$$\alpha_n(h,\varepsilon)=\sup\{P_n(t_1,x,t_2,V_\varepsilon(x));\ x\in\mathscr{X},\ |t_1-t_2|\le h\}.$$

Theorem 7. Let the marginal distributions of the processes $\xi_n(t)$ converge
to the marginal distributions of the process $\xi_0(t)$ and let the following
conditions be satisfied:

1) for each $\varepsilon>0$, $\lim_{h\to 0}\sup_n\alpha_n(h,\varepsilon)=0$;

2) if $a=t_0<t_1<\dots<t_n=b$, $\lambda=\max_k(t_{k+1}-t_k)$, then for every $\varepsilon>0$

$$\lim_{\lambda\to 0}\sum_{k=0}^{n-1}\mathsf{P}\{\rho(\xi_0(t_k),\xi_0(t_{k+1}))>\varepsilon\}=0.$$

Then for each function $\varphi$ continuous on $\mathscr{C}_{[a,b]}(\mathscr{X})$ the distribution of $\varphi(\xi_n(\cdot))$ will
converge to the distribution of $\varphi(\xi_0(\cdot))$.

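As a preliminary, we prove the following lemma.

Lemma 2. If for a separable Markov process $\xi(t)$ the quantity $\alpha(h,\varepsilon/2)$,
defined in the same manner as $\alpha_n(h,\varepsilon/2)$ is defined for $\xi_n(t)$, is less than
1, then

$$\mathsf{P}\{\sup[\rho(\xi(t),\xi(s));\ s\in[t,t+h]]>\varepsilon\}\le\frac{1}{1-\alpha(h,\varepsilon/2)}\,\mathsf{P}\Big\{\rho(\xi(t),\xi(t+h))>\frac{\varepsilon}{2}\Big\}. \qquad (10)$$

Proof. Taking into account the separability of the process, it is sufficient
to prove (10) in the case when the sup under the sign of the probability is
taken over any finite subset $I$ of the interval $[t,t+h]$. Let $I=\{t=t_0,\dots,t_n=t+h\}$.
Denote by $B_k$ the event $\{\rho(\xi(t_0),\xi(t_k))\le\varepsilon\}$ and by $C_k$ the event
$\{\rho(\xi(t_k),\xi(t_n))>\varepsilon/2\}$; a bar denotes the complementary event. Then

$$C_0\supset\bigcup_{j=1}^{n}\{B_1\cap\dots\cap B_{j-1}\cap\bar{B}_j\cap\bar{C}_j\}.$$

Therefore

$$\mathsf{P}\{C_0\}\ge\sum_{j=1}^{n}\mathsf{P}\{\bar{C}_j\mid B_1\cap\dots\cap B_{j-1}\cap\bar{B}_j\}\,\mathsf{P}\{B_1\cap\dots\cap B_{j-1}\cap\bar{B}_j\}\ge(1-\alpha(h,\varepsilon/2))\sum_{j=1}^{n}\mathsf{P}\{B_1\cap\dots\cap B_{j-1}\cap\bar{B}_j\}.$$

It remains to note that

$$\sum_{j=1}^{n}\mathsf{P}\{B_1\cap\dots\cap B_{j-1}\cap\bar{B}_j\}=\mathsf{P}\{\sup[\rho(\xi(t),\xi(s));\ s\in I]>\varepsilon\}.$$

The lemma is thus proved. □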

Proof of Theorem 7. Choose $h$ so small that $\sup_n\alpha_n(2h,\varepsilon/8)<\dfrac{1}{2}$. It then
follows from Lemma 2 that

$$\mathsf{P}\Big\{\sup[\rho(\xi_n(t),\xi_n(s));\ s\in[t,t+2h]]>\frac{\varepsilon}{4}\Big\}\le 2\,\mathsf{P}\Big\{\rho(\xi_n(t),\xi_n(t+2h))>\frac{\varepsilon}{8}\Big\}.$$

From this inequality we obtain, in the same manner as in the previous
theorem, that

$$\varlimsup_{n\to\infty}\mathsf{P}\Big\{\sup_{|t_1-t_2|\le h}\rho(\xi_n(t_1),\xi_n(t_2))>\varepsilon\Big\}\le 4\sum_{kh<b-a}\mathsf{P}\Big\{\rho(\xi_0(a+(k+1)h),\xi_0(a+kh))\ge\frac{\varepsilon}{16}\Big\}$$

(if $t>b$ we set $\xi(t)=\xi(b)$). The proof of the theorem then follows from
this inequality and condition 2).



§5. Limit Theorems for Processes without Discontinuities of the Second Kind

Metrics in the space of functions without discontinuities of the second kind.
To apply the results of Section 1 to processes without discontinuities of
the second kind, it is necessary, as a preliminary, to introduce a suitable
metric in the space of functions without discontinuities of the second kind. Denote by
$\mathscr{D}_{[a,b]}(\mathscr{X})$ the set of functions $x(t)$ defined on $[a,b]$, taking on values in
a complete metric space $\mathscr{X}$ and possessing limit values $x(t+0)$ for $t<b$
and $x(t-0)$ for $t>a$. Since any interval $[a,b]$ can be mapped continuously
and in a one-to-one manner onto the interval $[0,1]$, we shall
consider the space $\mathscr{D}_{[0,1]}(\mathscr{X})$. Functions which coincide at all points of
continuity will not be distinguished; therefore it is natural to give a
standard definition for the values of the function $x(t)$ at the discontinuity
points. In what follows, we shall assume that for all functions in $\mathscr{D}_{[0,1]}(\mathscr{X})$
the following relations are satisfied:

$$x(t)=x(t+0),\qquad x(0)=x(+0),\qquad x(1)=x(1-0). \qquad (1)$$

The value $\rho(x(t-0),x(t))$ is called the size of the jump of $x(t)$ at the point $t$.
It is necessary to introduce in $\mathscr{D}_{[0,1]}(\mathscr{X})$ a metric which will transform
$\mathscr{D}_{[0,1]}(\mathscr{X})$ into a separable metric space with the property that the minimal
σ-algebra containing all cylindrical sets coincides with the σ-algebra
of Borel sets in this space. It is also desirable that the metric be sufficiently
"strong" (i.e. that there will be as few as possible converging sequences
and hence as many functionals as possible continuous in this metric).
The uniform metric

$$\rho_u(x(\cdot),y(\cdot))=\sup_{0\le t\le 1}\rho(x(t),y(t))$$

is not suitable for these purposes since in this metric $\mathscr{D}_{[0,1]}(\mathscr{X})$ is not a
separable space. Indeed the set of functions

$$x_s(t)=\begin{cases}x_1, & t<s,\\ x_2, & t\ge s,\end{cases}\qquad 0<s<1,$$

where $x_1$ and $x_2$ are points of $\mathscr{X}$ with $\rho(x_1,x_2)=\delta>0$,
is uncountable, but the distance between any two elements of this set
is $\delta$. We introduce in the space $\mathscr{D}_{[0,1]}(\mathscr{X})$ a metric which is somewhat weaker
than the uniform metric.

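Denote by $\Lambda$ the totality of all continuous monotonically increasing
numerical functions $\lambda(t)$ with $t\in[0,1]$ such that $\lambda(0)=0$, $\lambda(1)=1$ (i.e.
$\lambda(t)$ maps the interval $[0,1]$ onto itself continuously and in a one-to-one
fashion).

Note that for all $\lambda\in\Lambda$ the inverse functions $\lambda^{-1}$ exist and also
belong to $\Lambda$. If $\lambda_1$ and $\lambda_2\in\Lambda$ then the composite function $\lambda_1(\lambda_2)$ belongs
to $\Lambda$ as well.

Define for each pair $x(t)$ and $y(t)$ in $\mathscr{D}_{[0,1]}(\mathscr{X})$ the quantity

$$r_{\mathscr{D}}(x,y)=\inf\Big\{\sup_{0\le t\le 1}\rho(x(t),y(\lambda(t)))+\sup_{0\le t\le 1}|t-\lambda(t)|;\ \lambda\in\Lambda\Big\}. \qquad (2)$$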
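The effect of the time change in (2) can be seen on two one-jump functions. The following sketch rests on our own assumptions: real-valued functions with $\rho(x,y)=|x-y|$, jump points $s=0.3$ and $s'=0.4$, evaluation on a finite grid, and hypothetical helper names. It computes the uniform distance and, for one particular piecewise-linear $\lambda\in\Lambda$ with $\lambda(s)=s'$, the expression under the inf in (2), which is an upper bound for the distance defined by (2).

```python
def step(s):
    """Indicator of [s, 1]: a right-continuous function with one unit jump at s."""
    return lambda t: 1.0 if t >= s else 0.0


def time_change(s, s2):
    """Piecewise-linear lambda with lambda(0)=0, lambda(s)=s2, lambda(1)=1."""
    def lam(t):
        if t < s:
            return t * s2 / s
        return s2 + (t - s) * (1.0 - s2) / (1.0 - s)
    return lam


def uniform_distance(x, y, n=1000):
    """sup |x(t) - y(t)| over the grid t = i/n."""
    return max(abs(x(i / n) - y(i / n)) for i in range(n + 1))


def skorohod_upper_bound(x, y, lam, n=1000):
    """Value of the expression under the inf in (2) for one particular lambda."""
    grid = [i / n for i in range(n + 1)]
    space = max(abs(x(t) - y(lam(t))) for t in grid)   # sup rho(x(t), y(lambda(t)))
    time = max(abs(t - lam(t)) for t in grid)          # sup |t - lambda(t)|
    return space + time
```

With `x = step(0.3)` and `y = step(0.4)` the uniform distance equals the jump size, whereas the bound obtained from `time_change(0.3, 0.4)` is about $|s-s'|$: the time change aligns the two jumps, so only the time distortion contributes.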

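We show that $r_{\mathscr{D}}$ determines a metric in $\mathscr{D}_{[0,1]}(\mathscr{X})$. To do so it is
necessary to verify that $r_{\mathscr{D}}$ satisfies the three axioms: a) $r_{\mathscr{D}}(x,y)\ge 0$
and is zero if $x=y$; b) $r_{\mathscr{D}}(x,y)=r_{\mathscr{D}}(y,x)$; c) $r_{\mathscr{D}}(x,z)\le r_{\mathscr{D}}(x,y)+r_{\mathscr{D}}(y,z)$
for all $x(\cdot)$, $y(\cdot)$ and $z(\cdot)$ in $\mathscr{D}_{[0,1]}(\mathscr{X})$.

Condition a) is obvious. Condition b) follows from the relation

$$r_{\mathscr{D}}(y,x)=\inf_{\lambda\in\Lambda}\Big\{\sup_{0\le t\le 1}\rho(y(t),x(\lambda(t)))+\sup_{0\le t\le 1}|t-\lambda(t)|\Big\}=\inf\Big\{\sup_{0\le t\le 1}\rho(y(\lambda^{-1}(t)),x(t))+\sup_{0\le t\le 1}|\lambda^{-1}(t)-t|;\ \lambda\in\Lambda\Big\}=r_{\mathscr{D}}(x,y).$$

We now discuss condition c), the triangle inequality, in some
detail. Let $x(\cdot)$, $y(\cdot)$ and $z(\cdot)$ be functions in $\mathscr{D}_{[0,1]}(\mathscr{X})$. For each $\varepsilon>0$
we can find functions $\lambda_1(t)$ and $\lambda_2(t)$ such that the following conditions
are satisfied:

$$r_{\mathscr{D}}(x,y)\ge\sup_{0\le t\le 1}\rho(x(t),y(\lambda_1(t)))+\sup_{0\le t\le 1}|t-\lambda_1(t)|-\varepsilon,\qquad r_{\mathscr{D}}(y,z)\ge\sup_{0\le t\le 1}\rho(y(t),z(\lambda_2(t)))+\sup_{0\le t\le 1}|t-\lambda_2(t)|-\varepsilon. \qquad (3)$$

Then

$$r_{\mathscr{D}}(x,z)\le\sup_{0\le t\le 1}\rho(x(t),z(\lambda_2(\lambda_1(t))))+\sup_{0\le t\le 1}|t-\lambda_2(\lambda_1(t))|$$

$$\le\sup_{0\le t\le 1}\rho(x(t),y(\lambda_1(t)))+\sup_{0\le t\le 1}|t-\lambda_1(t)|+\sup_{0\le t\le 1}\rho(y(\lambda_1(t)),z(\lambda_2(\lambda_1(t))))+\sup_{0\le t\le 1}|\lambda_1(t)-\lambda_2(\lambda_1(t))|$$

$$=\sup_{0\le t\le 1}\rho(x(t),y(\lambda_1(t)))+\sup_{0\le t\le 1}|t-\lambda_1(t)|+\sup_{0\le t\le 1}\rho(y(t),z(\lambda_2(t)))+\sup_{0\le t\le 1}|t-\lambda_2(t)|,$$

since if $t$ runs through the interval $[0,1]$, $\lambda_1(t)$ runs through the same
interval. Taking into account relation (3) we have

$$r_{\mathscr{D}}(x,z)\le r_{\mathscr{D}}(x,y)+r_{\mathscr{D}}(y,z)+2\varepsilon.$$

Since $\varepsilon>0$ is arbitrary, condition c) is verified.

Therefore $r_{\mathscr{D}}$ is a proper distance function on $\mathscr{D}_{[0,1]}(\mathscr{X})$.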

The following auxiliary assertions will be required for the further
investigation of the properties of the metric $r_{\mathscr{D}}$.
Define for each function $x(\cdot)$ in $\mathscr{D}_{[0,1]}(\mathscr{X})$

$$\Delta_c(x)=\sup\{\min[\rho(x(t'),x(t));\ \rho(x(t),x(t''))];\ t-c<t'<t<t''<t+c\}+\sup\{\rho(x(0),x(t));\ 0<t<c\}+\sup\{\rho(x(t),x(1));\ 1-c<t<1\}. \qquad (4)$$

Then in view of Lemma 1, Section 4 in Chapter III we have $\lim_{c\to 0}\Delta_c(x)=0$.
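To see how the quantity defined in (4) differs from the ordinary modulus of continuity, the following sketch evaluates a grid analogue of both. The discretization is our own assumption: the function is real-valued with $\rho(x,y)=|x-y|$, it is given by its values `values[i]` at the points $i/n$, and the window $c$ is represented as `w` grid steps ($c=w/n$); the function names are hypothetical.

```python
def delta_c(values, w):
    """Grid analogue of the quantity in (4) for values[i] = x(i/n), c = w/n."""
    n = len(values) - 1
    inner = 0.0
    for i in range(1, n):
        # t' ranges over grid points in (t - c, t), t'' over grid points in (t, t + c)
        left = [abs(values[j] - values[i]) for j in range(max(0, i - w + 1), i)]
        right = [abs(values[i] - values[j]) for j in range(i + 1, min(n, i + w - 1) + 1)]
        if left and right:
            # t' and t'' vary independently, so sup of min[., .] = min of the two sups
            inner = max(inner, min(max(left), max(right)))
    start = max((abs(values[0] - values[j]) for j in range(1, min(w, n + 1))), default=0.0)
    end = max((abs(values[j] - values[n]) for j in range(max(1, n - w + 1), n)), default=0.0)
    return inner + start + end


def modulus_of_continuity(values, w):
    """sup |x(t1) - x(t2)| over grid points with |t1 - t2| < w/n."""
    n = len(values) - 1
    return max(abs(values[i] - values[j])
               for i in range(n + 1) for j in range(i, min(n, i + w - 1) + 1))
```

For a single unit jump the grid value of `delta_c` is zero for small windows while the modulus of continuity stays equal to the jump size; two jumps closer together than the window make `delta_c` large, which is the behaviour that the compactness criterion below controls.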



Lemma 1. Let $x(\cdot)$ be a function in $\mathscr{D}_{[0,1]}(\mathscr{X})$ and let $[\alpha,\beta]\subset[0,1]$. If
$x(\cdot)$ has no jumps on $[\alpha,\beta]$ of size exceeding $\varepsilon$, then for $|t'-t''|<c$ with
$t',t''\in[\alpha,\beta]$ we have

$$\rho(x(t'),x(t''))\le 2\Delta_c(x)+\varepsilon.$$

Proof. We choose an arbitrary $\delta\in(0,\varepsilon)$ and a point $\tau$ in $[t',t'']$ with the
property that for $t\in[t',\tau)$

$$\rho(x(t'),x(t))<\Delta_c(x)+\delta,\qquad \rho(x(t'),x(\tau))\ge\Delta_c(x)+\delta.$$

If there is no such point, then $\rho(x(t'),x(t''))<\Delta_c(x)+\delta$ and hence the
assertion of the lemma is satisfied. If a point $\tau$ exists, then, in view of the
fact that

$$\min[\rho(x(t'),x(\tau));\ \rho(x(\tau),x(t''))]\le\Delta_c(x)$$

and $\rho(x(t'),x(\tau))\ge\Delta_c(x)+\delta$, we have $\rho(x(\tau),x(t''))\le\Delta_c(x)$. Therefore

$$\rho(x(t'),x(t''))\le\rho(x(t'),x(\tau-0))+\rho(x(\tau-0),x(\tau))+\rho(x(\tau),x(t''))\le\Delta_c(x)+\delta+\varepsilon+\Delta_c(x).$$

Approaching the limit as $\delta\to 0$ we obtain the proof of the lemma. □

Denote by $Y_m$ the countable set of points $y_{mk}\in\mathscr{X}$ such that
$\bigcup_k S_{1/m}(y_{mk})=\mathscr{X}$, where $S_\varepsilon(x)$ denotes an open sphere of radius $\varepsilon$ with
the center at $x$. Denote by $H_{m,n}$ the collection of functions $x(\cdot)\in\mathscr{D}_{[0,1]}(\mathscr{X})$
which are constant on each one of the intervals $\Big[\dfrac{k}{n},\dfrac{k+1}{n}\Big)$ and take
on values in $Y_m$.

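Lemma 2. For each function $x(\cdot)$ in $\mathscr{D}_{[0,1]}(\mathscr{X})$ there exists a function $x^*(\cdot)$
in $H_{m,n}$ such that

$$r_{\mathscr{D}}(x,x^*)\le\frac{1}{n}+\frac{1}{m}+4\Delta_{2/n}(x).$$

Proof. There is at most one point in each one of the intervals $\Big[\dfrac{k}{n},\dfrac{k+1}{n}\Big)$
with a jump of size exceeding $2\Delta_{2/n}(x)$. Indeed, let $\tau$ be such a point;
then

$$\rho(x(s),x(\tau-0))=\min[\rho(x(s),x(\tau-0));\ \rho(x(\tau-0),x(\tau))]\le\Delta_{1/n}(x)\quad\text{for } s\in\Big[\frac{k}{n},\tau\Big),$$

$$\rho(x(s),x(\tau))\le\Delta_{1/n}(x)\quad\text{for } s\in\Big[\tau,\frac{k+1}{n}\Big),$$

and hence

$$\rho(x(s-0),x(s))\le 2\Delta_{1/n}(x)\le 2\Delta_{2/n}(x),\qquad s\ne\tau.$$

Let $\tau_k$ be the point in the interval $\Big[\dfrac{k}{n},\dfrac{k+1}{n}\Big)$ such that

$$\rho(x(\tau_k-0),x(\tau_k))>2\Delta_{2/n}(x),$$

provided such a point exists in this interval. Denote by $\lambda(t)$ a function
in $\Lambda$ such that $\lambda\Big(\dfrac{k+1}{n}\Big)=\tau_k$ and $t-\dfrac{1}{n}\le\lambda(t)\le t$. (A piecewise-linear
function defined by the equalities $\lambda(0)=0$, $\lambda\Big(\dfrac{k+1}{n}\Big)=\tau_k$ and $\lambda(1)=1$ is such
a function.) Set $\bar{x}(t)=x(\lambda(t))$. The function $\bar{x}(t)$ has jumps of size
exceeding $2\Delta_{2/n}(x)$ only at the points of the form $k/n$ and moreover

$$r_{\mathscr{D}}(\bar{x},x)\le\sup_{0\le t\le 1}\rho(\bar{x}(t),x(\lambda(t)))+\sup_{0\le t\le 1}|t-\lambda(t)|\le\frac{1}{n}.$$

Next let $\tilde{x}^*(t)$ be the function equal to $\bar{x}(k/n)$ for $t\in\Big[\dfrac{k}{n},\dfrac{k+1}{n}\Big)$,
and let $\tilde{x}^*(1)=\bar{x}\Big(\dfrac{n-1}{n}\Big)$. Then

$$r_{\mathscr{D}}(\bar{x},\tilde{x}^*)\le\sup_{0\le t\le 1}\rho(\bar{x}(t),\tilde{x}^*(t))\le\sup_k\sup\Big\{\rho\Big(\bar{x}(t),\bar{x}\Big(\frac{k}{n}\Big)\Big);\ \frac{k}{n}\le t<\frac{k+1}{n}\Big\}.$$

Since the jumps of $\bar{x}(t)$ exceeding $2\Delta_{2/n}(x)$ occur only at the points
of the form $k/n$, there are no such jumps inside the half-interval $\Big[\dfrac{k}{n},\dfrac{k+1}{n}\Big)$
and hence in view of Lemma 1

$$\rho\Big(\bar{x}\Big(\frac{k}{n}\Big),\bar{x}(t)\Big)\le 2\Delta_{1/n}(\bar{x})+2\Delta_{2/n}(x)\quad\text{for } t\in\Big[\frac{k}{n},\frac{k+1}{n}\Big).$$

We estimate $\Delta_{1/n}(\bar{x})$:

$$\Delta_{1/n}(\bar{x})=\sup\Big\{\min[\rho(\bar{x}(t'),\bar{x}(t));\ \rho(\bar{x}(t),\bar{x}(t''))];\ t-\frac{1}{n}<t'<t<t''<t+\frac{1}{n}\Big\}+\sup\Big\{\rho(\bar{x}(0),\bar{x}(t));\ 0<t<\frac{1}{n}\Big\}+\sup\Big\{\rho(\bar{x}(t),\bar{x}(1));\ 1-\frac{1}{n}<t<1\Big\}$$

$$=\sup\Big\{\rho(x(0),x(\lambda(t)));\ 0<t<\frac{1}{n}\Big\}+\sup\Big\{\rho(x(\lambda(t)),x(1));\ 1-\frac{1}{n}<t<1\Big\}+\sup\Big\{\min[\rho(x(\lambda(t')),x(\lambda(t)));\ \rho(x(\lambda(t)),x(\lambda(t'')))];\ t-\frac{1}{n}<t'<t<t''<t+\frac{1}{n}\Big\}.$$

Note that $t_1-\dfrac{1}{n}\le\lambda(t_1)<\lambda(t_2)\le t_2<t_1+\dfrac{1}{n}$ for $t_1<t_2<t_1+\dfrac{1}{n}$, so that
$0\le\lambda(t_2)-\lambda(t_1)\le\dfrac{2}{n}$. Therefore $\Delta_{1/n}(\bar{x})\le\Delta_{2/n}(x)$. Hence $r_{\mathscr{D}}(\bar{x},\tilde{x}^*)\le 4\Delta_{2/n}(x)$.
Finally we set $x^*(t)=y_{mk}$, where $k$ is the smallest index such that
$\rho(\tilde{x}^*(t),y_{mk})<\dfrac{1}{m}$. Since $\rho(x^*(t),\tilde{x}^*(t))\le 1/m$ it follows that $r_{\mathscr{D}}(x^*,\tilde{x}^*)\le 1/m$ and

$$r_{\mathscr{D}}(x,x^*)\le r_{\mathscr{D}}(x,\bar{x})+r_{\mathscr{D}}(\bar{x},\tilde{x}^*)+r_{\mathscr{D}}(\tilde{x}^*,x^*)\le\frac{1}{n}+4\Delta_{2/n}(x)+\frac{1}{m}.$$

The lemma is thus proved. □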

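Corollary. The space $\mathscr{D}_{[0,1]}(\mathscr{X})$ with the metric $r_{\mathscr{D}}$ is a separable space.

This follows from the fact that, in view of Lemma 2, the countable
set $\bigcup_{m,n}H_{m,n}$ is everywhere dense in $\mathscr{D}_{[0,1]}(\mathscr{X})$.

Let $X_1$ be a compactum in $\mathscr{X}$ and let $\lambda_\delta$ be an increasing continuous
function defined for $\delta>0$ and satisfying the condition $\lambda_{+0}=0$. Denote by
$K_{\mathscr{D}}(X_1,\lambda_\delta)$ the set of functions in $\mathscr{D}_{[0,1]}(\mathscr{X})$ such that $x(t)\in X_1$ for $t\in[0,1]$
and $\Delta_c(x)\le\lambda_c$ for all $c>0$.

Theorem 1. 1) The set $K_{\mathscr{D}}(X_1,\lambda_\delta)$ is a compactum in $\mathscr{D}_{[0,1]}(\mathscr{X})$; 2) for
each compactum $K_1\subset\mathscr{D}_{[0,1]}(\mathscr{X})$ one can find a compactum $X_1\subset\mathscr{X}$ and an increasing
continuous function $\lambda_\delta$ with $\lambda_{+0}=0$ such that $K_1\subset K_{\mathscr{D}}(X_1,\lambda_\delta)$.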

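Proof. 1) We show that $K_{\mathscr{D}}(X_1,\lambda_\delta)$ possesses a finite $\varepsilon$-net for each $\varepsilon>0$.
To do this we note that for each $m$ there exists $k_m$ such that

$$\bigcup_{k=1}^{k_m}S_{1/m}(y_{mk})\supset X_1.$$

We choose $m$ and $n$ satisfying $\dfrac{1}{n}+\dfrac{1}{m}+4\lambda_{2/n}<\varepsilon$. Then the set of functions
$H_{m,n}\cap F[y_{m1},\dots,y_{mk_m}]$, where $F[y_1,\dots,y_r]$ is the set of functions taking
on only the values $y_1,\dots,y_r$, is a finite $\varepsilon$-net in the set $K_{\mathscr{D}}(X_1,\lambda_\delta)$. Indeed,
in view of Lemma 2, $H_{m,n}$ is a $\Big(\dfrac{1}{n}+\dfrac{1}{m}+4\lambda_{2/n}\Big)$-net in $K_{\mathscr{D}}(X_1,\lambda_\delta)$ and
moreover functions with values in the set $\{y_{m1},\dots,y_{mk_m}\}$ form such a net.
The set $K_{\mathscr{D}}(X_1,\lambda_\delta)$ is closed. It is easy to verify the relationship

$$\Delta_c(x)\le\Delta_{c+r_{\mathscr{D}}(x,y)}(y)+3r_{\mathscr{D}}(x,y).$$

Therefore if $r_{\mathscr{D}}(x_n,x)\to 0$, then for each $\varepsilon>0$

$$\Delta_c(x)\le\varliminf_{n\to\infty}\Delta_{c+\varepsilon}(x_n)\le\lambda_{c+\varepsilon}.$$

Hence in view of the continuity of $\lambda_\delta$, $\Delta_c(x)\le\lambda_c$. It is also clear that
$\lim_{n\to\infty}x_n(t)\in X_1$ if $x_n(t)\in X_1$ for all $n$.
Consequently, the limit of a sequence belonging to $K_{\mathscr{D}}(X_1,\lambda_\delta)$ will
also belong to $K_{\mathscr{D}}(X_1,\lambda_\delta)$. It remains to show that each fundamental
sequence $x_n(\cdot)$ belonging to $K_{\mathscr{D}}(X_1,\lambda_\delta)$ will be convergent. Let $x_n(\cdot)$ be
a sequence of functions in $K_{\mathscr{D}}(X_1,\lambda_\delta)$ for which $r_{\mathscr{D}}(x_n,x_m)\to 0$ as $n\to\infty$
and $m\to\infty$ (i.e. $x_n(\cdot)$ is a fundamental sequence). It is sufficient to show
that some subsequence possesses the limit $x(\cdot)$. Therefore it can be
assumed that the sequence $x_n(\cdot)$ is such that $r_{\mathscr{D}}(x_n,x_{n+1})<\dfrac{1}{2^{n+1}}$. Then
there exists a sequence of functions $\lambda_n$ belonging to $\Lambda$ such that

$$\sup_{0\le t\le 1}|\lambda_{n+1}(t)-t|\le\frac{1}{2^{n+1}},\qquad \sup_{0\le t\le 1}\rho(x_n(t),x_{n+1}(\lambda_{n+1}(t)))\le\frac{1}{2^{n+1}}.$$

Set $\mu_1(t)=\lambda_1(t)$, $\mu_n(t)=\lambda_n(\mu_{n-1}(t))$. Since

$$\sup_{0\le t\le 1}|\mu_n(t)-\mu_{n-1}(t)|=\sup_{0\le t\le 1}|\lambda_n(t)-t|\le\frac{1}{2^n},$$

$\mu_n(t)$ converges to a certain non-decreasing continuous function $\mu(t)$
satisfying the conditions $\mu(0)=0$, $\mu(1)=1$. Next

$$\sup_{0\le t\le 1}\rho(x_n(\mu_n(t)),x_{n-1}(\mu_{n-1}(t)))=\sup_{0\le t\le 1}\rho(x_n(\lambda_n(t)),x_{n-1}(t))\le\frac{1}{2^n}.$$

Therefore $x_n(\mu_n(t))$ is uniformly convergent to a certain function $x^*(t)$
in $\mathscr{D}_{[0,1]}(\mathscr{X})$. We investigate the connection between the functions $x^*(t)$
and $\mu(t)$. Let $\mu(t)$ be constant on a certain interval $[\alpha,\beta]$. If $x^*(\alpha)=x^*(\beta)$
then $x^*(t)$ is also constant on $[\alpha,\beta]$; if, however, $x^*(\alpha)\ne x^*(\beta)$ then
a $\gamma\in[\alpha,\beta]$ exists such that $x^*(t)=x^*(\alpha)$ for $t\in[\alpha,\gamma)$, and also $x^*(t)=x^*(\beta)$
for $t\in[\gamma,\beta]$. Indeed, otherwise points $t'<t''<t'''$ could have been found
belonging to $[\alpha,\beta]$ such that $x^*(t')\ne x^*(t'')$, $x^*(t'')\ne x^*(t''')$ and then

$$\lim_{n\to\infty}\min[\rho(x_n(\mu_n(t'')),x_n(\mu_n(t''')));\ \rho(x_n(\mu_n(t'')),x_n(\mu_n(t')))]=\min[\rho(x^*(t''),x^*(t''')),\ \rho(x^*(t''),x^*(t'))]>0,$$

while $\mu_n(t')<\mu_n(t'')<\mu_n(t''')$ and $\mu_n(t')$, $\mu_n(t'')$, $\mu_n(t''')$ tend to $\mu(\alpha)$. This
would contradict the fact that the sequence $x_n(\cdot)$ belongs to $K_{\mathscr{D}}(X_1,\lambda_\delta)$.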
Denote by $x(\cdot)$ the function in $\mathscr{D}_{[0,1]}(\mathscr{X})$ defined by the relations

$$x^*(t)=x(\mu(t)), \qquad (5)$$

satisfied for all $t$ such that $\mu(s)>\mu(t)$ for all $s\in(t,1]$. Relation (5) determines
a unique function $x(t)\in\mathscr{D}_{[0,1]}(\mathscr{X})$.
We show that this function $x(\cdot)$ is the limit of the sequence $x_n(\cdot)$. For
this purpose we construct auxiliary functions $\varphi_n(t)$. Let $\tau_1,\dots,\tau_k$ be all
the points in $[0,1]$ at which $x(\cdot)$ possesses jumps of size exceeding $1/n$.
Denote by $[\alpha_i,\beta_i]$ the maximal interval on which $\mu(t)$ takes on the value
$\tau_i$ (this interval may also be a singleton).

Let $\gamma_i$ be a point in $[\alpha_i,\beta_i]$ such that $x^*(t)=x(\tau_i-0)$ for $t\in[\alpha_i,\gamma_i)$ and
$x^*(t)=x(\tau_i)$ for $t\in[\gamma_i,\beta_i]$. In particular if $\alpha_i=\gamma_i$, then $x^*(t)$ takes a
unique value on the interval $[\alpha_i,\beta_i]$. We choose an $\varepsilon_n$ not exceeding
$1/n$ such that $\Delta_{\varepsilon_n}(x)<1/n$. Let $\varphi_n(t)$ be the function satisfying the relations:

We bound $\sup\{\rho(x^*(t),x(\varphi_n(t)));\ 0\le t\le 1\}$. If $t$ does not belong to
any one of the intervals $[\alpha_i,\beta_i]$ then in view of Lemma 1

$$\rho(x^*(t),x(\varphi_n(t)))=\rho(x(\mu(t)),x(\varphi_n(t)))\le 2\Delta_{\varepsilon_n}(x)+\frac{1}{n},$$

since $x(t)$ has no jumps between $\mu(t)$ and $\varphi_n(t)$ of size exceeding $1/n$. If
$t\in[\alpha_i,\gamma_i)$, then

$$\rho(x^*(t),x(\varphi_n(t)))\le\sup\{\rho(x(\tau_i-0),x(s));\ s\in[\tau_i-\varepsilon_n,\tau_i)\}\le\Delta_{\varepsilon_n}(x),$$

because $\rho(x(\tau_i-0),x(\tau_i))>\dfrac{1}{n}>\Delta_{\varepsilon_n}(x)$. Analogously we show that for
$t\in[\gamma_i,\beta_i]$, $\rho(x^*(t),x(\varphi_n(t)))\le\Delta_{\varepsilon_n}(x)$. Consequently,

$$\sup_{0\le t\le 1}\rho(x^*(t),x(\varphi_n(t)))\le\frac{1}{n}+2\Delta_{\varepsilon_n}(x)\le\frac{3}{n}.$$

We now estimate r^(x„, x). We have 

Vs (x„, x) ^ r® (x„ ( • ), X* in~ M ■ ))) + 

+ r&b* (^^n ^ ( • )X ^ {(Pn H • )))) + x{(p„ (/I ~ ^)))) < 

^ sup e(x„(/i„(£)),x*(£))+ sup e(x*(t),x((j9„(t))) + 



+ sup \t-(p„{n„ ^(t))|< 



1 

r 



3 

+ -+ sup |/I„(£)-<P„(£)I< 
n 



1 3 1 

^ — H 

2" n 2" 



Therefore $r_{\mathcal{D}}(x_n, x) \to 0$, i.e. the sequence $x_n(\cdot)$ converges to the function $x(\cdot)$. Assertion 1) is proved.

2) Denote by $X_1$ the set of values $x(t)$ and $x(t-0)$ with $x(\cdot) \in K_1$. The proof that $X_1$ is a compactum is identical to the one given in Lemma 1 of Section 4.

We set $\Delta_c = \sup\{\Delta_c(x);\ x(\cdot) \in K_1\}$. Clearly $\Delta_c$ is a monotonically increasing function of $c$. We show that $\lim_{c \downarrow 0} \Delta_c = 0$. Assume the contrary. Then one can find a sequence of functions $x_n(\cdot) \in K_1$ and a sequence $c_n \downarrow 0$ such that $\Delta_{c_n}(x_n) \ge \delta$ for some $\delta > 0$. Since $K_1$ is compact one can assume that $x_n(\cdot) \to x_0(\cdot)$. However for $r_{\mathcal{D}}(x, y) \le \varepsilon$

\[ \Delta_c(x) \le \Delta_{c+\varepsilon}(y) + 3\varepsilon. \]

Therefore for each $c > 0$

\[ \Delta_c(x_0) \ge \Delta_{c_n}(x_n) - 3 r_{\mathcal{D}}(x_n, x_0) \ge \delta - 3 r_{\mathcal{D}}(x_n, x_0), \]

as long as $c_n < c - r_{\mathcal{D}}(x_n, x_0)$. Hence $\Delta_c(x_0) \ge \delta$ for all $c > 0$ and this contradicts the condition that $\lim_{c \downarrow 0} \Delta_c(x_0) = 0$. Thus $\lim_{c \downarrow 0} \Delta_c = 0$. Clearly a continuous monotonic function $\lambda_c$ can be constructed such that $\lambda_c \ge \Delta_c$ and $\lambda_{+0} = 0$. Then $K_1 \subset K(X_1, \lambda_c)$. The theorem is thus proved. □

The basic limit theorem for processes without discontinuities of the second kind

Theorem 2. Let $\xi_n(t)$, $n = 0, 1, \ldots$, be a sequence of processes without discontinuities of the second kind with values in $\mathcal{X}$; moreover let the marginal distributions of $\xi_n(t)$ converge to the marginal distributions of $\xi_0(t)$. In order for every functional $f$ defined on $\mathcal{D}_{[0,1]}(\mathcal{X})$ and continuous in the metric $r_{\mathcal{D}}$ the distribution of $f(\xi_n(\cdot))$ to converge to the distribution of $f(\xi_0(\cdot))$, it is necessary and sufficient that for all $\varepsilon > 0$ the condition

\[ \lim_{c \to 0} \overline{\lim_{n \to \infty}}\, P\{\Delta_c(\xi_n(\cdot)) > \varepsilon\} = 0 \tag{6} \]

be satisfied.

Proof. It follows from equality (6) that for all $\varepsilon > 0$ the relation

\[ \lim_{c \to 0} \sup_n P\{\Delta_c(\xi_n(\cdot)) > \varepsilon\} = 0 \]

is satisfied.

From here in the same manner as in the proof of Theorem 1 in Sec- 
tion 4 we verify the existence of a continuous monotone function 
such that >^-+0 = 0 ^nd 

sup P {zl, (4 ( • )) < 4, 0 < c < 1 } > 1 - e/2. (7) 

n 

Utilizing the convergence of marginal distributions of the processes 
^„{t) we can show - in the same manner as in Theorem 1 of Section 4 - 
that the family of measures {v„^, n=l, 2,... ; where v„f{A) = 

= P {^n{t)e A} is compact. Hence for each k one can find a compactum 

such that 1 — 2"^^ - for all n and t. Denote by X^^^ the set of y 
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such that Then X^ = f^ is a compactum. Since 

k 

relations x x A 2 -k{x)<^ 2 -^^^P^y that x(t)eZ^^^ 

r* ^ 1 

for we have 

CO 2 ^ 

X! X p[p (— V V l~2k^ 

k=i 1 = 0 2k P 2 + I I 2 7<£. 

f k = l 1 = 0 ^ 



Thus for each £>0 a compactum K^{Xj^, 7^), is constructed such that 
for all measures /i„ associated with the random processes (^„ ( • ) on 
the inequality 

is satisfied. It now remains to apply Theorem 1 of Section 1 (as it follows 
from the remark following Theorem 1, the conditions of the theorem are 
sufficient also in an incomplete space). Sufficiency is thus established. 

To prove the necessity of condition (6) we introduce the functional 

F^(x(-))= sup ^(x(0), x(t)) sup ^(x(l), x(t)) + 

+ sup(min[^(x(t), x(s)) ^(x(t), x(u)) 

It is easy to verify that F^(x(*)) is a continuous functional on i](^). 
Therefore if the distribution f{^„(')) converges to the distribution of 
/ (^o(’)) continuous functions /, we then have for each a>0 

IHH P{F„(U-))>e}^P{FAU-)>e}. 



Now note that 



Therefore 



^a(^(-))<^c(^(-)) + 5e““ sup e(x(0), x(t)). 



lim P{4(<^„(-))<8}<lim P{Fi/,(,^„(-))> e 'fi}< 



<P{FM-))>e-^ e}^P{A^,iU-))>^e-^ s} + 

+ P {0 sup 1 e (*^0 (0). <^o (0) ^ ^ • 
In view of Lemma 1, Section 4, Chapter III, the finiteness of $\sup_t \rho(x(0), x(t))$ for all $x(\cdot) \in \mathcal{D}_{[0,1]}(\mathcal{X})$, and the fact that $\xi_0(\cdot) \in \mathcal{D}_{[0,1]}(\mathcal{X})$ with probability 1, the right-hand side of the last inequality tends to zero as $c \to 0$. The theorem is thus proved. □

Theorem 3. Let $\xi_n(t)$, $n = 0, 1, \ldots$, be a sequence of random processes belonging to $\mathcal{D}_{[0,1]}(\mathcal{X})$ with probability 1 such that the marginal distributions of $\xi_n(t)$ converge to the marginal distributions of $\xi_0(t)$, and let $\alpha > 0$, $\beta > 0$ and $H > 0$ exist for which the inequality

E[e(^„(fi), Uh)) QiUh), 

is satisfied with $n \ge 0$ and $t_1 < t_2 < t_3$.

Then for each continuous functional $f$ on $\mathcal{D}_{[0,1]}(\mathcal{X})$ the distribution of the variables $f(\xi_n(\cdot))$ converges to the distribution of the variable $f(\xi_0(\cdot))$.

Proof. We utilize Lemma 4 of Section 4 in Chapter III. If we set g [h) = 
where 0<y<P/oc, and q{h) = 2^^°'Hh^, where S = p — (xy, then for all pro- 
cesses (r) the conditions of this lemma are satisfied with 

'y — my y — md 



Hence for some L 



P{A,m-))>s}^Ls-<‘c-. 

To complete the proof it remains to apply Theorem 2. □ 

Limit theorems for Markov processes. Let $\xi_n(t)$ be a sequence of Markov processes defined on $[0, 1]$ with sample functions belonging to $\mathcal{D}_{[0,1]}(\mathcal{X})$ with probability 1. Denote by $P_n(t, x, s, A)$ the transition probabilities of the process $\xi_n(t)$. Next let $V_\varepsilon(x) = \{y : \rho(x, y) > \varepsilon\}$.

Theorem 4. If the marginal distributions of the processes $\xi_n(t)$ converge to the marginal distributions of $\xi_0(t)$ and if for all $\varepsilon > 0$ the condition

\[ \lim_{h \downarrow 0} \overline{\lim_{n \to \infty}}\, \sup\{P_n(t, x, s, V_\varepsilon(x));\ x \in \mathcal{X},\ 0 \le s - t \le h\} = 0 \]

is satisfied, then for each continuous functional $f$ on $\mathcal{D}_{[0,1]}(\mathcal{X})$ the distribution of $f(\xi_n(\cdot))$ converges to the distribution of $f(\xi_0(\cdot))$.

The proof of this theorem is based on the following lemma. 

Lemma 3. Let (^ 2 ? •••? be a Markov chain such that for all k<l 
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with probability 1. Then 
Proof. The event 

{sup (min [p ((^,-, ij ) ; q |,)] ; 1 < i <; < / < n} ^ 4e} 

implies one of the events n where 

^r={e(<?i, y <2e,j=l,..., r-1; e(^i, 
B,=(supe(4, 



Therefore 

P (sup (min (.J,-, <^j) ; q {i^^, ^,)] ; 1 < i <7 < / ^ > 4s} < 



n 






I P{B,|^1,..., ir]P(d(0)= 



1 

r= 1 



f 

P{B,\ U P{dco). 

Ar 



In view of Lemma 2 in Section 4 we have 






a 

1— a 



and 

t P{4} = P(supe(?i, P(e(^i, U>s}. 

r=l k 1— a 

The result follows from these 2 inequalities. The lemma is thus proved. □ 

Corollary. If { (/) is a separable Markov process for which the transition 
probability P(t, x, s. A) satisfies the inequality P{t, x, s, F£(x))^a< 1 for 
ti^t<s<t 2 , then 

P (sup {min [p (i (t'), ^ (t")) ; q ( i (f"), ^ (t'"))] ; 1 1 ^ t' < t" < t'" ^ ^ 4e} < 



We now proceed to the proof of Theorem 4. It is sufficient to show that for each $\varepsilon > 0$

\[ \lim_{c \to 0} \overline{\lim_{n \to \infty}}\, P\{\Delta_c(\xi_n(\cdot)) > \varepsilon\} = 0. \]

We estimate this probability. Let c be a small number such that for all 
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n sufficiently large 

sup{P„(t, X, s, P^/8 

Then 

4('^«(-))< sup e((J„(0), <^„(t))+ sup p(^„(l), ^„(f))+ 



+ sup<min[^((^„(r), kc^f<t<t"^ 



^(/c + 3) c, k<->. 

c 

Therefore 

P{4(4(-))^e}^p| sup Q(i„{0). i„{t))^^\ + 
lo^t^c 4) 



+ Pj sup Q{^„{l),i„{t))^-\ + 



+ E P ^up {min [q {t% {t)) ; q ({„ (t), (J, (t"))] ; 



kc^t' <t<t" ^{k + 3) 

<7^ + 77-^^ Z pje(4(H 4(fcc + 3c))^|}>, 



where 



Since 



l-a„ (l-a„)\ 1 { 



a„ = sup(P„(t, X, s, 1^/8 0<5 — t^3c}. 



p {in (kc), {kc 4- 3c)) ^ ^ j ^ 

< P |e (4 (^c), ^ (/cc + c)) >^+ 

+ P (4 (fcc + ^^)> 4 + 2c)) > + 

+ p |e (4 (fcc + 2c), 4 (fee + 3c)) > , 







Chapter VI. Limit Theorems for Random Processes 



and ^2 it follows that 

l-a„ 

P{4(^a-))^e}=^4a„j^l+3 X p|e(4(fcc), + • 

Therefore, in view of the condition of the theorem and the equality 



n^ao k<l/c I 



= J^J[QiUkcl Ukc + c)>2^^, 

which is valid for almost all a>0 it is sufficient to show that the sum in 
the r.h.s. of the last equality remains bounded as c^O. We choose h 
in such a manner that 



Po{t,x,s,V^^(x))^^ for s-t^h, 6i = 



96 ' 



It is sufficient to show that the sum 



Z_ P{Q(io(f<^c),io{kc + c))^4Ei} 

i^kc<i + h 

is bounded for any t with a given h. 

Let if Q{io{kc), ^Q{kc-\-c))^4si, and rjj^ = 0 otherwise. 

We are required to show that ^ Erj^ is uniformly bounded in t 



t^kc<t + h 



and c. 

We estimate 



P{_ S. '/k>0- 

i ^kc<i + h 

We shall consider only rjj^ with index k for which t^kc<t + h. 
Let be the event 

k^r 



Then 



P{Z'/k>0 = Z Z »?*>0}} = 

r k>r 

= Z [ P(Z Vk>^\^o{rc + c)} P(dco)^ 

r J k>r 
Ar 
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< 




Ar 



P {supe (^ 0 (kc), io (re + c)) ^ 2ei I (rc + c)} P(rf<u) < 

k>r 



A ~3 r 



Therefore for all t and c, 



Yj Y ^ 2 )^ — 4- 

i^kc<i + h l=i 

The theorem is thus proved. □ 

Processes with independent increments in a complete linear normed 
space X are particular cases of Markov processes. We thus have the 
following theorem as a corollary of Theorem 4. 

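Theorem 5. Let $\xi_n(t)$, $n = 0, 1, \ldots$, be a sequence of processes with independent increments defined on $[0, 1]$ with values in $\mathcal{X}$ and belonging to $\mathcal{D}_{[0,1]}(\mathcal{X})$ with probability 1. If the marginal distributions of the processes $\xi_n(t)$ converge to the marginal distributions of $\xi_0(t)$ and if for each $\varepsilon > 0$

\[ \lim_{h \to 0} \overline{\lim_{n \to \infty}}\, \sup_{|t - s| \le h} P\{|\xi_n(t) - \xi_n(s)| > \varepsilon\} = 0, \]

then for each continuous functional $f$ on $\mathcal{D}_{[0,1]}(\mathcal{X})$ the distribution of $f(\xi_n(\cdot))$ converges to the distribution of $f(\xi_0(\cdot))$.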

Remark. It is sufficient to require in Theorems 2-5 that the functional $f$ be measurable and $\mu_0$-almost everywhere continuous, where $\mu_0$ is the measure associated with the limiting process in $\mathcal{D}_{[0,1]}(\mathcal{X})$.

Applications to Statistics. We shall apply the limit theorems discussed above to the study of the asymptotic behavior of the empirical distribution function used in mathematical statistics.

Suppose that the results of a certain experiment represent a random variable $\xi$ with an unknown continuous distribution function $F(x)$. How is one to estimate the function $F(x)$ if $n$ results $\xi_1, \xi_2, \ldots, \xi_n$ of independent outcomes of the experiment are available?

For this purpose an empirical distribution function $F_n^*(x)$ is utilized in mathematical statistics. This function is defined by the relation

\[ F_n^*(x) = \frac{\nu_n(x)}{n}, \]

where $\nu_n(x)$ is the number of variables $\xi_k$ which fall into the interval $(-\infty, x)$. It follows from Bernoulli's theorem that $F_n^*(x)$ converges in probability to $F(x)$. Thus the function $F_n^*(x)$ can be taken as an estimator of $F(x)$. Obviously we are interested in the error incurred by choosing this estimator. On the other hand, it is often convenient to have an analytic expression for the approximation of $F(x)$. In this case the following problem should be solved: can a given function $\Phi(x)$ serve as an approximation of $F(x)$ if the results of the experiment $\xi_1, \xi_2, \ldots, \xi_n$ are known? In the first as well as the second case it is important to determine the behavior of the difference between the empirical and theoretical distribution functions.
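As a small illustration of the estimator just described, the following sketch computes an empirical distribution function from a finite sample. It assumes the usual normalization $F_n^*(x) = \nu_n(x)/n$ with $\nu_n(x)$ counting the observations strictly below $x$; the function name `empirical_cdf` and the sample values are hypothetical and serve only as an example.

```python
from bisect import bisect_left


def empirical_cdf(sample):
    """Return the function x -> (number of sample points strictly below x) / n."""
    ordered = sorted(sample)
    n = len(ordered)

    def f_star(x):
        # bisect_left counts the points strictly less than x, which matches
        # the open interval (-inf, x) used in the text.
        return bisect_left(ordered, x) / n

    return f_star


f_star = empirical_cdf([0.9, 0.1, 0.4, 0.7])
print(f_star(0.5))  # two of the four observations lie below 0.5
```

Sorting once and using binary search keeps each evaluation logarithmic in the sample size, which is convenient when the function is evaluated at many points.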

To study this difference we introduce the process

\[ \eta_n(t) = \sqrt{n}\,[F_n^*(t) - F(t)]. \]

Lemma 4. The marginal distributions of the processes $\eta_n(t)$ converge to the marginal distributions of a Gaussian process $\eta(t)$ with $E\eta(t) = 0$ and $E\eta(t)\eta(\tau) = F(t)[1 - F(\tau)]$ for $t < \tau$.

Proof. We note that

\[ \eta_n(t) = \frac{1}{\sqrt{n}} \sum_{k=1}^{n} [\varepsilon(\xi_k - t) - F(t)], \]

where $\varepsilon(t) = 0$ for $t \ge 0$ and $\varepsilon(t) = 1$ for $t < 0$. Since

\[ E\varepsilon(\xi_k - t) = F(t), \qquad E\varepsilon(\xi_k - t)\,\varepsilon(\xi_k - \tau) = F(t) \quad \text{for } t < \tau, \]

and the processes $\varepsilon(\xi_k - t) - F(t)$ are independent for distinct values of $k$, the rest of the proof follows from Theorem 1, Section 1 in Chapter III. □

Corollary. Let $F^{-1}(t)$ be the inverse of $F(t)$. Set

\[ \xi_n(t) = \eta_n(F^{-1}(t)), \qquad \xi(t) = \eta(F^{-1}(t)). \]

Then the marginal distributions of the processes $\xi_n(t)$ converge to the marginal distributions of a Gaussian process $\xi(t)$ defined for $t \in [0, 1]$ such that

\[ E\xi(t) = 0, \qquad E\xi(t)\xi(s) = t(1 - s) \quad \text{for } t < s. \]

Remark 1. The process $\xi_n(t)$ can be represented in the form

\[ \xi_n(t) = \frac{1}{\sqrt{n}} \sum_{k=1}^{n} [\varepsilon(\eta_k - t) - t], \]

where $\eta_k = F(\xi_k)$ are independent variables uniformly distributed on $[0, 1]$.
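The representation in Remark 1 lends itself to a quick numerical check. The sketch below evaluates the normalized empirical process for a uniform sample and estimates $E\xi_n(t)\xi_n(s)$ by simulation, which for $t < s$ should be close to $t(1 - s)$. The function names, the sample size, the number of repetitions and the seed are all assumptions made for this illustration.

```python
import math
import random


def empirical_process(etas, t):
    """n^{-1/2} * sum_k [1{eta_k < t} - t] for a sample etas with values in [0, 1]."""
    n = len(etas)
    below = sum(1 for e in etas if e < t)
    return (below - n * t) / math.sqrt(n)


def monte_carlo_cov(t, s, n=50, reps=4000, seed=0):
    """Estimate E[xi_n(t) * xi_n(s)] from `reps` independent uniform samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        etas = [rng.random() for _ in range(n)]
        total += empirical_process(etas, t) * empirical_process(etas, s)
    return total / reps


estimate = monte_carlo_cov(0.3, 0.7)
print(estimate, 0.3 * (1 - 0.7))  # the two numbers should be close
```

For uniform observations the covariance equals $t(1 - s)$ for every $n$, not only in the limit, so the simulation error comes solely from the finite number of repetitions.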

Remark 2. Finite-dimensional distributions of the process $\xi(t)$ coincide with the conditional finite-dimensional distributions of a Brownian motion process $w(t)$, $0 \le t \le 1$, under the condition $w(1) = 0$. Since the conditional distributions of the process $w(t)$ given $w(1) = 0$ are Gaussian, it is sufficient to show that

\[ E(w(t) \mid w(1))_{w(1)=0} = 0, \qquad E(w(t)w(s) \mid w(1))_{w(1)=0} = t(1 - s) \quad \text{for } t < s. \]

The variable $\tilde{\xi}(t) = w(t) - t\,w(1)$ is uncorrelated with $w(1)$. Since the variables $\tilde{\xi}(t)$ and $w(1)$ have a joint Gaussian distribution, the process $\tilde{\xi}(t)$ is independent of $w(1)$. Therefore

\[ E(\tilde{\xi}(t) \mid w(1)) = E\tilde{\xi}(t) = 0, \qquad E(\tilde{\xi}(t)\tilde{\xi}(s) \mid w(1)) = E\tilde{\xi}(t)\tilde{\xi}(s). \]

Utilizing the relation $\tilde{\xi}(t) = w(t) - t\,w(1)$ and the previous formulas we obtain that

\[ E(w(t) \mid w(1)) = t\,w(1), \]

\[ E(w(t)w(s) \mid w(1)) = E\tilde{\xi}(t)\tilde{\xi}(s) + ts\,(w(1))^2 = \min[t; s] - ts + ts\,[w(1)]^2. \]

Setting $w(1) = 0$ we establish the validity of the statement at the beginning of Remark 2. □
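The covariance computation in Remark 2 can be replayed mechanically. The sketch below expands the covariance of $w(t) - t\,w(1)$ and $w(s) - s\,w(1)$ by bilinearity, using only $E\,w(t)w(s) = \min(t, s)$, and compares it with $t(1 - s)$. The helper names are hypothetical.

```python
def bm_cov(t, s):
    """Covariance E w(t) w(s) = min(t, s) of a standard Brownian motion."""
    return min(t, s)


def bridge_cov(t, s):
    """Covariance of w(t) - t*w(1) and w(s) - s*w(1), expanded term by term."""
    return (bm_cov(t, s)
            - s * bm_cov(t, 1.0)
            - t * bm_cov(1.0, s)
            + t * s * bm_cov(1.0, 1.0))


def cross_cov_with_endpoint(t):
    """Covariance of w(t) - t*w(1) with w(1); it is zero for every t in [0, 1]."""
    return bm_cov(t, 1.0) - t * bm_cov(1.0, 1.0)


for t, s in [(0.2, 0.5), (0.25, 0.75), (0.5, 0.9)]:
    print(t, s, bridge_cov(t, s), t * (1 - s))
```

The vanishing cross-covariance is the step that, together with joint Gaussianity, yields the independence used in the remark.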

Theorem 6. For any functional $f$ continuous on $\mathcal{D}_{[0,1]}$ the distribution of $f(\xi_n(\cdot))$ converges to the distribution of $f(\xi(\cdot))$.

Proof. We first note that the separable process $\xi(t)$ is continuous so that $\xi(t)$ belongs to $\mathcal{D}_{[0,1]}$ with probability 1. Indeed $\xi(t + h) - \xi(t)$ has a Gaussian distribution and moreover

Therefore in view of Theorem 7, Section 5, Chapter III, the process $\xi(t)$ is continuous. The convergence of the marginal distributions of the processes $\xi_n(t)$ to the marginal distributions of $\xi(t)$ is thus established.
In view of Theorem 2 it is sufficient to show that relation (6) is fulfilled.
Since 

Afx) ^ sup {|x(C) - x(r")| ; - F\ ^ c ) , 

the theorem is proved if for all 8>0 relation 

limEmPl sup |^„(f')-(^„(;")|>£}=0 (8) 

C-»0 M-+00 |t — f"|^C 

is verified. The process monotonically increasing, hence 

for t 1 2 "^ t t^ 

- (<4 - ^ l) < (^3) - (^2) ^ (4) - (? l) + nA {^4 - ^ 1 )• 
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Therefore 



sup sup 

|r'-r"|<c 2 |ki-k2l<c2"» + 2 






Let m„ be chosen in such a manner that > 0 as n -> oo and n2 ^ 1. 

2mn 

To prove (8) it is sufficient to show that for each a>0 

lim Imi P{ sup \Uki2-^-)-Uf^22-^^)\>s}=0. 

c-*On-^co \ki—k2\^c2^n 



Note that 

irin 

sup |^„(/ci 2 “'”")-(^„(fc 22“'"")|<2 sup 

|fci -jk2|^c2»^n r = m(^) i 



^ 2r ) 



where is the smallest integer satisfying relation ^ 1 (see the proof 
of Lemma 3, Section 4 in Chapter II concerning the last inequality). 
Choose an a < 1 such that > 1. Then 

P { sup >s}^ 

\ki-k2\^c2^n 



^ Z P^sup 

r = m(^) (, i 

mn 2^-1 

^ Z I P 

r = m(^'^ 1 = 0 
mn 2^-1 

< Z Z E 

r = m(‘^) i = 0 



2" ) ^-( 2 ' 



^ a 
>- 






< 



2 1-a 

^ 2 ( 1 — 

^1^ 2(1 -g) 



sa 



r-J]- (9) 



Let be the number of variables rji taking values in the interval \_t, t + h]. 
Then 

and 



^(t + h)-^{t) = yjny- — h j. 

Calculations show (cf B.V. Gnedenko [37], Chapter 6, Section 34, equa- 
tion 9) that 

E{Ut + h)-Ut)T<^h^ + -^^h^ + h2-’"’'. 

n 

Hence we have for 

E{Ut + h)-Ut)T^4h\ 
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Substituting this bound into inequality (9) we obtain 



P{ sup 

1^1 —k2\ 



m, 







g4^4(r-m,) 



where 



U = 






Z (2a")- 

r = 0 



The theorem is thus proved. □ 
Absolute Continuity of Measures Associated with
Random Processes



§1. General Theorems on Absolute Continuity 



First we shall review certain definitions in measure theory.

Let two measures $\mu_1$ and $\mu_2$ be defined on a measurable space $(\mathcal{X}, \mathfrak{B})$. The measure $\mu_2$ is called absolutely continuous with respect to the measure $\mu_1$ (denoted by $\mu_2 \ll \mu_1$) if $\mu_2(A) = 0$ for all $A \in \mathfrak{B}$ such that $\mu_1(A) = 0$. If $\mu_1 \ll \mu_2$ and $\mu_2 \ll \mu_1$ we write $\mu_1 \sim \mu_2$ and call these measures equivalent. Measures $\mu_1$ and $\mu_2$ are called mutually singular if there exists a set $A$ such that $\mu_1(A) = 0$ and $\mu_2(\mathcal{X} - A) = 0$. Mutually singular measures are also called orthogonal and are denoted by $\mu_1 \perp \mu_2$. If the measures $\mu_1$ and $\mu_2$ are finite then $\mu_2 = \nu_1 + \nu_2$, where $\nu_1 \ll \mu_1$ and $\nu_2 \perp \mu_1$. This representation is unique. The measures $\nu_1$ and $\nu_2$ are called respectively the absolutely continuous and singular components of the measure $\mu_2$ with respect to the measure $\mu_1$.

For finite measures the Radon-Nikodym theorem is valid: $\mu_2 \ll \mu_1$ iff there exists a $\mathfrak{B}$-measurable function $\rho(x)$ such that for all $A \in \mathfrak{B}$ the equality

\[ \mu_2(A) = \int_A \rho(x)\,\mu_1(dx) \]

is satisfied.

The function $\rho(x)$, which is unique up to equivalence in measure $\mu_1$, is called the density or the derivative of the measure $\mu_2$ with respect to the measure $\mu_1$ and is denoted by $\rho(x) = \frac{d\mu_2}{d\mu_1}(x)$. If $\mu_2$ is not absolutely continuous with respect to $\mu_1$ then $\frac{d\mu_2}{d\mu_1}$ denotes the derivative of the absolutely continuous component of the measure $\mu_2$ with respect to $\mu_1$. In particular if $\mu_1 \perp \mu_2$ then $\frac{d\mu_2}{d\mu_1} = 0$.



the term -continuous is also used. Translator’s Remark. 
In this chapter we shall consider the case where $\mu_1$ and $\mu_2$ are probabilistic measures, i.e. $\mu_i(\mathcal{X}) = 1$. If $\mathcal{X}$ is a functional space then $\mathfrak{B}$ denotes the $\sigma$-algebra generated by cylindrical sets, so that the measure $\mu_i$ can be viewed as a measure associated with a certain random process. The subject of this chapter is the study of conditions of absolute continuity, equivalence and singularity of such measures as well as the evaluation of the density of one measure with respect to another.

When proving theorems on absolute continuity of probabilistic measures on a measurable space $(\mathcal{X}, \mathfrak{B})$ the following procedure is very often used. Let $\mathfrak{B}_n$ be an increasing sequence of $\sigma$-algebras such that $\sigma\{\bigcup_n \mathfrak{B}_n\} = \mathfrak{B}$ and let $\mu_i^n$ be the restriction of the measure $\mu_i$ to $\mathfrak{B}_n$. It is assumed that the structure of the $\sigma$-algebras $\mathfrak{B}_n$ is such that the absolute continuity of the measure $\mu_2^n$ with respect to the measure $\mu_1^n$ can be easily verified. If $\mathcal{X}$ is a functional space then $\mathfrak{B}_n$ usually signify $\sigma$-algebras generated by the cylindrical sets with bases in a fixed finite-dimensional subspace of $\mathcal{X}$. Let $\mu_2^n \ll \mu_1^n$ and

\[ \rho_n(x) = \frac{d\mu_2^n}{d\mu_1^n}(x). \]



The variables $\rho_n(x)$ form a martingale on the probability space $(\mathcal{X}, \mathfrak{B}, \mu_1)$. Indeed for each $\mathfrak{B}_n$-measurable function $f(x)$

\[ \int f(x)\rho_{n+1}(x)\,\mu_1(dx) = \int f(x)\rho_{n+1}(x)\,\mu_1^{n+1}(dx) = \int f(x)\,\mu_2^{n+1}(dx) = \int f(x)\,\mu_2^{n}(dx) = \int f(x)\rho_n(x)\,\mu_1^{n}(dx) = \int f(x)\rho_n(x)\,\mu_1(dx). \]

From here, in view of the definition of the conditional mathematical expectation on the probability space $(\mathcal{X}, \mathfrak{B}, \mu_1)$, we have

\[ E(\rho_{n+1}(x) \mid \mathfrak{B}_n) = \rho_n(x). \]

However, the variable $\rho_n(x)$ is $\mathfrak{B}_n$-measurable and hence $\rho_n(x)$ is a martingale. Since $\rho_n(x) \ge 0$ and $\int \rho_n(x)\,\mu_1(dx) = 1$, it follows from the theorem on the limit of martingales (cf. Theorem 1, Section 2, Chapter II) that the limit

\[ \lim_{n \to \infty} \rho_n(x) = \rho(x) \tag{1} \]

exists $\mu_1$-almost everywhere.
Theorem 1. The function $\rho(x)$ defined by relation (1) is a density of the absolutely continuous component of the measure $\mu_2$ with respect to $\mu_1$, i.e.

\[ \rho(x) = \frac{d\mu_2}{d\mu_1}(x). \]



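Proof. Let $\mu_2 = a\mu' + b\mu''$, where $a + b = 1$, and $\mu' \ll \mu_1$, $\mu'' \perp \mu_1$, so that $\mu'$ and $\mu''$ are probability measures. Denote by $\mu'^{\,n}$ and $\mu''^{\,n}$ the contractions of these measures on $\mathfrak{B}_n$. Then

\[ \rho_n(x) = a\rho_n'(x) + b\rho_n''(x), \]

where

\[ \rho_n'(x) = \frac{d\mu'^{\,n}}{d\mu_1^n}(x), \qquad \rho_n''(x) = \frac{d\mu''^{\,n}}{d\mu_1^n}(x). \]

To prove the theorem it is sufficient to show that $\rho_n''(x) \to 0$ and $\rho_n'(x) \to \frac{d\mu'}{d\mu_1}(x)$ $\mu_1$-almost everywhere. For each $\mathfrak{B}_n$-measurable bounded function $f(x)$ the equality

\[ \int f(x)\frac{d\mu'}{d\mu_1}(x)\,\mu_1(dx) = \int f(x)\,\mu'(dx) = \int f(x)\,\mu'^{\,n}(dx) = \int f(x)\rho_n'(x)\,\mu_1(dx) \]

is satisfied. Therefore

\[ \rho_n'(x) = E\left(\frac{d\mu'}{d\mu_1}(x) \,\Big|\, \mathfrak{B}_n\right), \]

where the conditional mathematical expectation is taken over the probability space $(\mathcal{X}, \mathfrak{B}, \mu_1)$. In view of Theorem 4, Section 2, Chapter II, for each monotonically increasing sequence of $\sigma$-algebras $\mathfrak{B}_n$ and each integrable variable $\zeta$ we have with probability 1

\[ \lim_{n \to \infty} E(\zeta \mid \mathfrak{B}_n) = E\left(\zeta \,\Big|\, \sigma\Big\{\bigcup_n \mathfrak{B}_n\Big\}\right). \]

Hence

\[ \lim_{n \to \infty} \rho_n'(x) = E\left(\frac{d\mu'}{d\mu_1}(x) \,\Big|\, \mathfrak{B}\right) = \frac{d\mu'}{d\mu_1}(x). \]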







We now show that $\rho_n''(x) \to 0$ almost everywhere in measure $\mu_1$. Let $\lim_{n \to \infty} \rho_n''(x) = \rho''(x)$ (the existence of the limit follows from the fact that $\rho_n''(x)$ is a martingale). For each $\mathfrak{B}_m$-measurable nonnegative function $f(x)$ we have, in view of Fatou's lemma,

\[ \int f(x)\,\mu''(dx) = \int f(x)\,\mu''^{\,n}(dx) = \int f(x)\rho_n''(x)\,\mu_1(dx) = \lim_{n \to \infty} \int f(x)\rho_n''(x)\,\mu_1(dx) \ge \int f(x)\rho''(x)\,\mu_1(dx) \]

(where $n > m$). Thus for $A \in \mathfrak{B}$ we have $\int_A \rho''(x)\,\mu_1(dx) \le \mu''(A)$. Let $A$ be such that $\mu''(A) = 0$, $\mu_1(A) = 1$. Then $\int_A \rho''(x)\,\mu_1(dx) = 0$. Hence $\rho''(x) = 0$ $\mu_1$-almost everywhere on the set $A$, and since $\mu_1(A) = 1$ it follows that $\rho''(x) = 0$ $\mu_1$-almost everywhere. The theorem is thus proved. □
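A one-dimensional toy case may help to visualize the martingale of densities. In the sketch below, which is only an illustration under stated assumptions, $\mu_1$ is Lebesgue measure on $[0, 1)$, $\mu_2$ has distribution function $x^2$ (density $2x$), and $\mathfrak{B}_n$ is taken to be the $\sigma$-algebra generated by the dyadic intervals of length $2^{-n}$. On each such interval the restricted density is the ratio of the two measures of that interval. All names in the code are hypothetical.

```python
import math


def dyadic_interval(x, n):
    """Return the dyadic interval [a, b) of length 2**-n that contains x in [0, 1)."""
    k = math.floor(x * 2 ** n)
    return k / 2 ** n, (k + 1) / 2 ** n


def mu1(a, b):
    """Lebesgue measure of [a, b)."""
    return b - a


def mu2(a, b):
    """Measure of [a, b) under the distribution function G(x) = x**2 (density 2x)."""
    return b * b - a * a


def rho_n(x, n):
    """Density of mu2 with respect to mu1 on the n-th dyadic sigma-algebra, at x."""
    a, b = dyadic_interval(x, n)
    return mu2(a, b) / mu1(a, b)


x = 0.3
for n in (1, 4, 8, 16):
    print(n, rho_n(x, n))  # the values approach the density 2 * x
```

Averaging `rho_n(., n + 1)` over the two halves of a dyadic interval of level `n` reproduces `rho_n(., n)` there, which is the martingale property in this toy setting.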



Corollary 1. If $\rho(x)$ defined by relation (1) is positive $\mu_1$-almost everywhere then $\mu_1 \ll \mu_2$ and




1 



^(x)’ 

0 , 



xgS, 

x^S, 



where 5gS is such that P 2 {S) =0, pi{S)=l. 



Indeed, for each S-measurable nonnegative function / (x) the equality 



^fix)H2(dx) 

s 



I 



f(x)Qix)Hi(dx) 



is valid. 

Taking ^(x)/^(x) in place of / (x) we obtain 



e c 

g{x)Hi(dx) = 

J •- 



g(x) 

q{x) 



Hiidx), 



which yields our assertion. 



Corollary 2. In order that the measure $\mu_2$ be absolutely continuous with respect to $\mu_1$ it is necessary and sufficient that the function $\rho(x)$ defined by relation (1) satisfy the condition

\[ \int \rho(x)\,\mu_1(dx) = 1. \tag{2} \]

Since

\[ \int \rho_n(x)\,\mu_1(dx) = 1, \tag{3} \]

in order that (2) be satisfied it is necessary and sufficient that the limit transition under the sign of the integral be permissible in (3), i.e. that the functions $\rho_n(x)$ be integrable with respect to the measure $\mu_1$ uniformly in $n$.

Sometimes in place of the measures $\mu_i^n$, which are contractions of the measures $\mu_i$ on $\mathfrak{B}_n$, it seems more convenient to consider approximations of the measures $\mu_i$ by means of certain measures $\mu_n^i$ for which the evaluation of $d\mu_n^2/d\mu_n^1$ is simpler. This state of affairs is analogous to a certain extent to the situation considered in Theorem 1 and its corollaries.

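Theorem 2. Let two sequences of probability measures $\mu_n^1$ and $\mu_n^2$ be defined on $(\mathcal{X}, \mathfrak{B})$ satisfying the conditions:

1) on a certain algebra $\mathfrak{B}_0$ whose $\sigma$-closure coincides with $\mathfrak{B}$ the sequence $\mu_n^i$ converges to $\mu_0^i$: $\lim_{n \to \infty} \mu_n^i(A) = \mu_0^i(A)$ for $A \in \mathfrak{B}_0$;

2) the measures $\mu_n^2$ are absolutely continuous with respect to $\mu_n^1$ for $n \ge 1$;

3) the functions $\rho_n(x) = \frac{d\mu_n^2}{d\mu_n^1}(x)$ are uniformly integrable with respect to $\mu_n^1$, i.e. for each $\varepsilon > 0$ an $N$ can be found such that for all $n$,

\[ \int \rho_n(x)\,\chi_{[N, \infty)}(\rho_n(x))\,\mu_n^1(dx) < \varepsilon, \]

where $\chi_{[N, \infty)}(t)$ is the indicator of the interval $[N, \infty)$. Then $\mu_0^2 \ll \mu_0^1$.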
Proof. For any $A \in \mathfrak{B}_0$ we have

\[ \mu_0^2(A) = \lim_{n \to \infty} \mu_n^2(A) = \lim_{n \to \infty} \int_A \rho_n(x)\,\mu_n^1(dx) \le N \lim_{n \to \infty} \mu_n^1(A) + \lim_{n \to \infty} \int_A \rho_n(x)\,\chi_{[N, \infty)}(\rho_n(x))\,\mu_n^1(dx) \le N\mu_0^1(A) + \varepsilon, \]

provided $N$ and $\varepsilon$ are chosen in such a manner that the inequality stated in condition 3) is satisfied. The class of sets $A$ such that

\[ \mu_0^2(A) \le N\mu_0^1(A) + \varepsilon \tag{4} \]

is a monotone class containing the algebra $\mathfrak{B}_0$; hence (4) is satisfied for all $A \in \mathfrak{B}$. It follows from (4) that $\mu_0^2(A) = 0$ provided only that $\mu_0^1(A) = 0$, since $\varepsilon > 0$ can be chosen arbitrarily small.

Remark. In order that condition 3) of Theorem 2 be satisfied, it is sufficient 
that one of the following conditions be fulfilled : 
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1) for some a>l 

sup I 

2) a positive continuous function (p{t) exists such that 

lim — ^ = 0 and sup (p {q„ (x)) (dx) < oo , 

t^oo (p(t) n j 

3) sup| log0„(x)/ij,^>(rfx) = sup| Q„{x)logQ„(x) tij,(dx)<co, 

4) measures are also absolutely continuous with respect to ix^ 
and for every s>0 an AT can be chosen such that 



dni 



1 



lim/r„^^:^(x)<-^<a. 



( 5 ) 



Indeed 1) and 3) are particular cases of 2) if we set (p{t) = f and 
q>{t) = t\ogt-{-l. The sufficiency of condition 2) follows from inequality 



I 



^«WZ[iv,cx))(^nW)7^„M^^)^sup— (p{q„(x)) fil{dx). 

t^Ncpyt), 



To prove the sufficiency of condition 4) we note that 



W XciV, oo)fe W) ^^n {dx) = H^{ 



is positive /z^-almost everywhere in view of the fact that we have 

for all n 



dfii 



1 



From here and utilizing (5) we derive condition 3). 

dfX^ 

Theorem 2 does not enable us to calculate — 4 W- Since the functions 

d^l 

^„(x) are defined, each one with respect to its own measure, it is meaning- 
less to speak of a limit of ^„(x) as n ->oo. However, in the particular case 
when all the measures iil coincide with /zj it makes sense to consider the 
limit of ^„(x) with respect to measure /xj. If this limit exists, then provided 
the conditions of Theorem 2 are satisfied this limit coincides with the 
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derivative dfil/dfil. We now prove a more general theorem concerning 



the density 



dul 

dul 



(x). 



Theorem 3. Let the random variables /= 1, 2, « = 0, \,... be defined on 
the probability space {Q, S, P} with values in a measurable space (^, 95) 
and let algebra ®o whose a-closure coincides with ® such that for all 
y4e®o 

Xa{O^Xa{^o) 

in probability P. If the measures p'„{A) = P A) on (^, ®) satisfy con- 
ditions 2) and 3) of Theorem 2 and if 



exist in the sense of convergence in probability, then 



dill 



Proof For all ^g®q the limit in probability 

iimQn{in)XA{il) = eXA{io) 

n~* 00 

exists. In view of condition 3) of Theorem 2 the functions XAi^n) 
are uniformly integrable in measure pi and therefore the limiting transi- 
tion under the sign of the mathematical expectation is permissible: 



t4(A)= lim ExA{in)= lim Ex^iC) Qniin)= ^XaHo)q= q(x) nh(dx). 

n-^oo n-^ao J 

A 

These relations can be extended in an obvious manner over the whole 
space A. The theorem is thus proved. □ 



dpi V j 

Remark L If the variables — r {fl) converge in probability to a certain 

dpi 



limit ^ > 0, condition 3) of Theorem 2 is automatically satisfied since in 
this case condition 4) of the Remark for Theorem 2 is valid. 



Remark 2. If we do not require condition 3) in Theorem 2, but instead, 
it is assumed that E^= 1, then the assertion of Theorem 3 remains valid. 
Indeed, it follows from Fatou’s lemma that 



lim XAiin)= lim jUn(A) = fii(A) 

n-*oo n-^oo 
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for all ^6 So- Moreover, the collection of sets A for which relation 



( 6 ) 

is satisfied forms a monotone class. Therefore (6) is satisfied for all 
Ae^. If for some A the inequality 

is valid then 

EQ=EQxAQ+^QX,^-A)(a)<I^UA) + fiU^-A)=l, 



which contradicts the assumption that E^=l. Hence Eqxa{^o) = f^o(^) 

for all A or q=-^ The case when the probability space coin- 
diil 



cides with (^, ©, ju) and the random variables are defined as measurable 
mappings of ^ into ^ is of special interest. It follows from Theorem 3 that : 



CoroUary 1. Let: 1) two sequences of measurable mappings Tf{x) and 
Tf (x) of into ^ be given and let the measures p\ be defined by the equali- 
ties p\[A) = p[T\~^{A)), where Tlf^{A) if the total inverse image of A 
under the mapping 

2) TIj(x)-^ Tq{x) p-almost for all x 

3) pI ^ pI and p-almost for all x, the nonnegative limits 



lim ^(TAx)) = Qi(x), 

n-*ao dpJ^ 

lim ^ [Tl (x)) = Q 2 (x) exist . 



Then pl'^pl and 

^(TAx)) = Qi{x), ^(r^(x)) = 02(x). 

dpo dpo 

Indeed the conditions of Theorem 3 and Remark 1 will be fulfilled in 
this case if we choose for algebra ®o the continuity sets * of the measure 
Pq + pI (measures p"„ are weakly convergent to Uq). 

Corollary 2. Let conditions 1) and 2) of corollary 1 be satisfied and more- 
over, let 3) pI = Pq = v, 4) p^«p, p«Pn tind the following non-zero limits 



* Cf. Chapter VI, Section 1 . 
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in measure 



lim~(x)=Qi(x), 

n^oo Ctfi 



lim^{T„^{x)) = Q2{x) 

n^co djLln 



exist. 5) ^-almost for all x the equality 



0i(r’oW)02W=i 



be satisfied. Then 






V and (v) = 



dv 

dll 






02W = ~(7’oh^))- 



Indeed it follows from Theorem 3 and Remark 1 that ii^v. Next, as 
was established in Remark 2, 

e2(x)^^(Ti(x)). 



Hence 

1 = ei (To (^)) S2 (x) < ^ (To (x))^ (T^o (-x)) = 1 , 

which implies that ^ 2 W = ~^(7(/W) /^-almost everywhere and 

dv 

. dv . dv 

Qi{Tq (x))= — (Tq^ (x)) ju-almost everywhere. But then (x)== — (x) v-al- 
a/r dll 

most everywhere and since v ^ /r the latter holds also //-almost everywhere. 
The assertion is thus proved. □ 

We now study the absolute continuity of measures in the case when 
the corresponding spaces are mapped. Let (^i, and {5^2, ® 2 ) be two 
measurable spaces. The mapping ip of into ^2 is called measurable 
if for all Aef&2‘ Assume that two measures ii^ and are 

defined on and let the measures 112 and V 2 be defined on ©2 by the 
equalities yUj (^) = Mi (<?> “ M^))> Vj (^) = v i (<? “ ' (^)). 

Theorem 4. If v^«ii^ then V2«ii2> ^nd moreover 



|^(<,-(x))=e(|1| 



where is the a-algebra of the sets of the form cp ^{A), -4g©2 the 
conditional mathematical expectation is taken over the probability space 

(^ 1 , © 1 , //i). 

Proof. Every © 2 -measurable function /(x) can be represented in the 
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form g{(p ^ (x)), where g is -measurable. Therefore 
f{x) V2(dx) = ^ g((p-^(x)) V2(tix) = | g{x) v^(dx) = 



1 (* / J 

9 (^) 1 {dx) = 0 (x) E ( ^ (x) 



©1 )/ii(dx). 



g((p ^ (x)) is ©2-measurable. Hence 



= Since ^(x) is ©^ -measurable, it follows that 
-measurable. Hence 

g (x) g(x)gi (dx) ={ g((p (x)) Q{(p-^ (x)) /I2 (dx) = 



= jf{x)Q{(p Hx))92(dx). 

The proof of the theorem follows from the last equality. □ 



§2. Admissible Shifts in Hilbert Spaces 

As we shall see in the next section, when studying absolute continuity of measures under various transformations, the absolute continuity and density of measures under the simplest transformation, the translation, play an important role. Let $\mu$ be a measure on $(\mathcal{X}, \mathfrak{B})$, where $\mathcal{X}$ is a Hilbert space and $\mathfrak{B}$ is the $\sigma$-algebra of Borel sets in this space. We introduce the translation operator $S_a x = x + a$. Denote by $\mu_a$ the measure defined by the relation $\mu_a(A) = \mu(S_{-a}A)$. Note that if $\mu$ is the distribution of a random element $\xi$ with values in $\mathcal{X}$ then $\mu_a$ is the distribution of the random element $\xi + a$. The measure $\mu_a$ is uniquely determined by the relation

\[ \int f(x)\,\mu_a(dx) = \int f(x + a)\,\mu(dx), \]

valid for all those measurable functions for which the integral on the right exists.

We say that $a$ is an admissible shift* of the measure $\mu$ if $\mu_a \ll \mu$. The set of admissible shifts of a measure is denoted by $M_\mu$, or simply by $M$ if there is no ambiguity concerning the measure under consideration. If $a \in M_\mu$, we denote

\[ \rho_\mu(a, x) = \rho(a, x) = \frac{d\mu_a}{d\mu}(x). \]

* Also called the admissible mean value for $\mu$. Translator's Remark.

In this section the structure of the set $M_\mu$ and the properties of the density $\rho_\mu(a, x)$ are investigated. In what follows probability measures on $\mathcal{X}$ are considered.



Theorem 1. The set $M_\mu$ is an additive semi-group, i.e. $a + b \in M_\mu$ provided $a \in M_\mu$ and $b \in M_\mu$; moreover,

\[ \rho(a + b, x) = \rho(a, x)\,\rho(b, x - a). \]

Proof. We have

\[ \int f(x)\,\mu_{a+b}(dx) = \int f(x + a + b)\,\mu(dx) = \int f(x + a)\,\rho(b, x)\,\mu(dx) = \int f(x)\,\rho(b, x - a)\,\mu_a(dx) = \int f(x)\,\rho(b, x - a)\,\rho(a, x)\,\mu(dx). \]

The proof follows from the fact that these equalities are valid for any bounded measurable function $f(x)$. □
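The semi-group identity can be checked numerically in the simplest case. The sketch below assumes, purely for illustration, that $\mu$ is the standard normal law on the real line (a one-dimensional Hilbert space), for which the density of the shifted measure is $\varphi(x - a)/\varphi(x) = \exp(ax - a^2/2)$ with $\varphi$ the standard normal density. The function names and the numerical values of $a$, $b$, $x$ are hypothetical.

```python
import math


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def rho(a, x):
    """Density of the shifted measure mu_a with respect to mu for standard normal mu."""
    # Closed form of normal_pdf(x - a) / normal_pdf(x).
    return math.exp(a * x - 0.5 * a * a)


a, b, x = 0.7, -1.2, 0.4
lhs = rho(a + b, x)
rhs = rho(a, x) * rho(b, x - a)
print(lhs, rhs)  # the two values agree up to rounding error
```

In this Gaussian example every real shift is admissible, so the example illustrates only the multiplicative formula, not the restriction on admissible shifts discussed next.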

The following theorem shows that there are not that many admissible shifts. In particular, it follows from the theorem that in any infinite-dimensional subspace of $\mathcal{X}$ the set $M_\mu$ is of the first category.

Theorem 2. Let $\varphi(z)$ be the characteristic functional of the measure $\mu$ and let a completely continuous nonnegative symmetric operator $B$ be such that $\varphi(z) \to 1$ as $(Bz, z) \to 0$. Then for each $a \in M_\mu$ a $b \in \mathcal{X}$ exists such that $a = B^{1/2}b$, i.e. $M_\mu \subset B^{1/2}\mathcal{X}$.


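Proof. Let $a \in M_\mu$ and $\rho(a, x) = \frac{d\mu_a}{d\mu}(x)$. Then the characteristic functional of the measure $\mu_a$ can be represented as

\[ \varphi_a(z) = \int e^{i(z, x)}\,\mu_a(dx) = \int e^{i(z, x + a)}\,\mu(dx) = e^{i(z, a)}\,\varphi(z). \]

Moreover

\[ \varphi_a(z) = \int e^{i(z, x)}\,\rho(a, x)\,\mu(dx). \]

Hence

\[ \varphi_a(z) - 1 = \int \left(e^{i(z, x)} - 1\right)\rho(a, x)\,\mu(dx). \]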
We show that $\varphi_a(z) \to 1$ for $(Bz, z) \to 0$. Since the inequality

\[ |1 - \psi(z)|^2 \le 2\,\mathrm{Re}\,(1 - \psi(z)) \]

is valid for any characteristic functional $\psi(z)$, it is sufficient to show that

\[ \mathrm{Re} \int \left(1 - e^{i(z, x)}\right)\rho(a, x)\,\mu(dx) \to 0 \]

as $(Bz, z) \to 0$. We set $\rho_N(x) = \rho(a, x)$ for $\rho(a, x) \le N$, and $\rho_N(x) = 0$ for $\rho(a, x) > N$. Then

\[ \mathrm{Re} \int \left(1 - e^{i(z, x)}\right)\rho(a, x)\,\mu(dx) = \int (1 - \cos(z, x))\,\rho_N(x)\,\mu(dx) + \int (1 - \cos(z, x))\,[\rho(a, x) - \rho_N(x)]\,\mu(dx) \le N\,\mathrm{Re}\,(1 - \varphi(z)) + 2\int [\rho(a, x) - \rho_N(x)]\,\mu(dx). \]

The second summand becomes arbitrarily small for all $z$ if $N$ is chosen sufficiently large, while the first summand approaches zero as $(Bz, z) \to 0$ for any $N$. We thus proved that $\varphi_a(z) \to 1$ as $(Bz, z) \to 0$. Hence $e^{i(a, z)} \to 1$ as $(Bz, z) \to 0$ and therefore $(a, z) \to 0$ as $(Bz, z) \to 0$. Let $|(a, z)| < \varepsilon$ provided $(Bz, z) < \delta$. Then for all $z$ the inequality

\[ |(a, z)|^2 \le \frac{\varepsilon^2}{\delta}\,(Bz, z) \]

is satisfied.

We note that a belongs to the closure of the range of values of the 
operator B since for all y such that By=0, we also have (a, y) = 0. Let 
be the eigenvalues and ej, be the corresponding eigenvectors of the 

” (^9 ^k) 

operator B. Putting C=—,z— ^ ' e^. we have 

u k=i H 

\k=l / k=l ^k 

hence ^ — < C. Approaching the limit as oo we verify the exis- 

k=i K (a e) 

tence of the vector h= ^ ^ ej,, which satisfies the relation B^^^b = a. 

The theorem is thus proved. □ 
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We now investigate the transformations of the set and the func- 
tion Q {a, x) under the simplest transformations of the measure ji. 

Theorem 3. 1) If $\nu = \mu_c$ then for any $c \in \mathscr{X}$, $M_\nu = M_\mu$,

$$\rho_\nu(a, x) = \rho_\mu(a, x - c).$$

2) If $\nu \ll \mu$, $f(x) = \frac{d\nu}{d\mu}(x)$ and $a \in M_\mu$, then $a \in M_\nu$ if and only if the expression

$$\rho_\nu(a, x) = \frac{f(x - a)}{f(x)}\, \rho_\mu(a, x) \qquad (1)$$

is defined $\mu$-almost everywhere, i.e. if

$$\mu(\{x : f(x) = 0\} - \{x : f(x - a)\, \rho_\mu(a, x) = 0\}) = 0$$

(we define $\frac{0}{0} = 0$). Moreover $\rho_\nu(a, x)$ is determined by formula (1).

3) If $\nu(A) = \mu(L^{-1} A)$, where $L$ is an invertible linear operator, then $M_\nu = L M_\mu$, $\rho_\nu(a, x) = \rho_\mu(L^{-1} a, L^{-1} x)$.

Proof. 1) follows from the equality

$$\int g(x)\, \nu_a(dx) = \int g(x + a + c)\, \mu(dx) = \int g(x + c)\, \rho_\mu(a, x)\, \mu(dx) = \int g(x)\, \rho_\mu(a, x - c)\, \mu_c(dx) = \int g(x)\, \rho_\mu(a, x - c)\, \nu(dx).$$

2) Let $g(x)$ be a bounded measurable function. If $\rho_\nu(a, x) = \frac{f(x - a)}{f(x)}\, \rho_\mu(a, x)$ is defined $\mu$-almost everywhere, we then have

$$\int g(x)\, \nu_a(dx) = \int g(x + a)\, \nu(dx) = \int g(x + a) f(x)\, \mu(dx) = \int g(x) f(x - a)\, \mu_a(dx) = \int g(x)\, \rho_\mu(a, x) f(x - a)\, \mu(dx) = \int g(x)\, \rho_\nu(a, x)\, \nu(dx).$$

If, however, $a \in M_\nu$, then for every bounded measurable function $g(x)$ the relation

$$\int g(x)\, \rho_\nu(a, x) f(x)\, \mu(dx) = \int g(x)\, \rho_\mu(a, x) f(x - a)\, \mu(dx)$$

is valid. It thus follows that $\rho_\nu(a, x) f(x) = \rho_\mu(a, x) f(x - a)$ $\mu$-almost everywhere. Assertion 2) is proved.

3) The following is valid for a bounded measurable function $g(x)$ and for $a = Lb$, with $b \in M_\mu$:

$$\int g(x + a)\, \nu(dx) = \int g(Lx + a)\, \mu(dx) = \int g(L(x + b))\, \mu(dx) = \int g(Lx)\, \rho_\mu(b, x)\, \mu(dx) = \int g(x)\, \rho_\mu(b, L^{-1} x)\, \nu(dx).$$

It thus follows that $a \in M_\nu$ and $\rho_\nu(a, x) = \rho_\mu(b, L^{-1} x)$ for $a \in L M_\mu$.

Utilizing the fact that the operator $L$ is invertible we obtain that $M_\nu \subset L M_\mu$. Hence $M_\nu = L M_\mu$. The theorem is thus proved. □



Remark. 1) It follows from 1) that if $\nu$ is the distribution of the variable $\xi + \eta$, where $\xi$ and $\eta$ are independent random variables with values in $\mathscr{X}$ and $\mu$ is the distribution of the variable $\xi$, then $M_\nu \supset M_\mu$ and for $a \in M_\mu$ the equality

$$\rho_\nu(a, x) = \mathsf{E}\left(\rho_\mu(a, \xi) \mid \xi + \eta\right)\big|_{\xi + \eta = x}$$

holds.

This equality follows from the relations

$$\int g(x)\, \nu_a(dx) = \int g(x + a)\, \nu(dx) = \iint g(x + y + a)\, \mu(dx)\, \mathsf{P}\{\eta \in dy\} = \iint g(x + y)\, \rho_\mu(a, x)\, \mu(dx)\, \mathsf{P}\{\eta \in dy\} = \mathsf{E}\, g(\xi + \eta)\, \rho_\mu(a, \xi) = \mathsf{E}\, g(\xi + \eta)\, \mathsf{E}\left(\rho_\mu(a, \xi) \mid \xi + \eta\right).$$

2) If under condition 3) $L$ is a non-invertible operator, then $M_\nu \supset L M_\mu$ and

$$\rho_\nu(a, x) = \mathsf{E}\left(\rho_\mu(b, \xi) \mid L\xi\right)\big|_{L\xi = x},$$

where $b$ is an arbitrary vector satisfying the relation $Lb = a$. This formula follows from Theorem 4 of Section 1.

Let $|a| = 1$. We study the conditions under which $\lambda a \in M_\mu$ for all $\lambda > 0$. Let $F(t) = \mu(\{x : (a, x) < t\})$; $F(t)$ is the distribution function of the random variable $(a, x)$ on the probability space $(\mathscr{X}, \mathfrak{B}, \mu)$. The distribution function $F(t - \lambda)$ of the variable $(a, x) + \lambda = (a, x + \lambda a)$ on the same probability space is absolutely continuous, for all $\lambda > 0$, with respect to the distribution of $(a, x)$. The following lemma describes the available information concerning $F(t)$ under these conditions.

Lemma. If the measure $\nu_\lambda(E)$ defined on the Borel sets of the line by the relation

$$\nu_\lambda(E) = \int_E dF(t - \lambda)$$

is absolutely continuous with respect to measure $\nu_0(E)$ for all $\lambda > 0$, then $F(t)$ is absolutely continuous and its derivative $p(t) = \frac{dF}{dt}(t)$ is such that for some $t_1$ (possibly equal to $-\infty$) $p(t) = 0$ for almost all $t < t_1$ and $p(t) > 0$ for almost all $t > t_1$.



Proof. We represent $F(t)$ in the form $F(t) = F_1(t) + F_2(t)$ where $F_1(t)$ is the absolutely continuous and $F_2(t)$ is the singular component of $F$. Let

$$\nu_\lambda^i(E) = \int_E dF_i(t - \lambda), \qquad i = 1, 2.$$

Since $\nu_\lambda^2 \ll \nu_0^1 + \nu_0^2$ and $\nu_\lambda^2 \perp \nu_0^1$, it follows that $\nu_\lambda^2 \ll \nu_0^2$. Therefore the measure

$$\nu(E) = \int_0^\infty \nu_\lambda^2(E)\, e^{-\lambda}\, d\lambda$$

will be absolutely continuous with respect to measure $\nu_0^2$. On the other hand

$$\nu(E) = \int_0^\infty \int_{t \in E} dF_2(t - \lambda)\, e^{-\lambda}\, d\lambda = \iint_{u - t < 0,\ t \in E} e^{u - t}\, dF_2(u)\, dt \le \mathrm{mes}\, E,$$

where $\mathrm{mes}\, E$ is the Lebesgue measure of the set $E$. Thus $\nu$ is absolutely continuous with respect to the Lebesgue measure. Since $\nu$ is also singular with respect to the Lebesgue measure, it follows that $\nu = 0$ and hence $F_2(t) = 0$.

Now let

$$\bar\nu(E) = \int_0^\infty \int_E dF(t - \lambda)\, e^{-\lambda}\, d\lambda \quad \text{and} \quad \nu(E) = \int_E dF(t).$$

It is easy to verify that

$$\frac{d\bar\nu}{d\nu}(t) = \frac{1}{p(t)} \int_0^\infty p(t - \lambda)\, e^{-\lambda}\, d\lambda,$$

and moreover, the numerator of this fraction must vanish for almost all $t$ on the set of arguments $s$ such that $p(s) = 0$. If the set $\{t : p(t) = 0\}$ is of positive Lebesgue measure, then one can find $s$ such that the Lebesgue measure of the set $\{t : p(t) = 0\} \cap (s - \delta, s)$ is positive for all $\delta > 0$. Then the function

$$\int_0^\infty p(t - \lambda)\, e^{-\lambda}\, d\lambda$$

vanishes for some $t \in (s - \delta, s)$ for any $\delta > 0$, and since this function is continuous it follows that $\int_0^\infty p(s - \lambda)\, e^{-\lambda}\, d\lambda = 0$ and hence $p(t) = 0$ for almost all $t < s$. If $p(t)$ is not identically zero, then a maximal $s$ with the above properties can be found. Thus $p(t) = 0$ for almost all $t < s$ and $p(t) > 0$ for almost all $t > s$. The lemma is proved. □
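The one-sided nature of the lemma can be illustrated with a density that vanishes on a half-line. The sketch below assumes, purely for illustration, the standard exponential density $p(t) = e^{-t}$ for $t > 0$ (so that $t_1 = 0$); this example is not taken from the text. It computes how much mass of the shifted law $\nu_\lambda$ lies on the set where $p > 0$: the value is $1$ exactly when $\nu_\lambda \ll \nu_0$.

```python
import math

def p(t):
    """Hypothetical density: standard exponential, zero for t <= 0 (so t_1 = 0)."""
    return math.exp(-t) if t > 0 else 0.0

def shifted_mass_on_support(lam, upper=50.0, n=100000):
    """Integral of p(t - lam) over {t : p(t) > 0} = (0, infinity), midpoint rule.

    This is nu_lam({p > 0}); it equals 1 iff nu_lam is absolutely continuous
    with respect to nu_0 for this particular density.
    """
    step = upper / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * step
        total += p(t - lam) * step
    return total

print(shifted_mass_on_support(1.5))   # shift to the right: no mass is lost
print(shifted_mass_on_support(-1.0))  # shift to the left: mass exp(-1) remains
```

A shift to the right keeps all mass on the support, while a shift by $\lambda < 0$ leaves only $e^{\lambda}$ there, so only nonnegative shifts are admissible for this density.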

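Let $\Gamma_t$ denote the hyperplane $(a, x) = t$. Define measures $\mu^t$ on the Borel sets of $\Gamma_t$ by means of the equalities

$$\mu^t(A) = \mu^t(A \cap \Gamma_t), \qquad \mu(A) = \int \mu^t(A)\, dF(t).$$

Thus $\mu^t(A)$ is the conditional distribution of $x$ under the condition $(a, x) = t$ on the probability space $(\mathscr{X}, \mathfrak{B}, \mu)$. We introduce the conditional distributions of the projection of $x$ on $\Gamma_0$ given $(a, x) = t$: $\nu^t(A) = \mu^t(S_{ta} A)$. Finally let $\nu$ be the unconditional distribution of the projection of $x$ on $\Gamma_0$:

$$\nu(A) = \int \nu^t(A)\, p(t)\, dt.$$

We introduce measure $\mu^*$ by means of the equality

$$\mu^*(A) = \int \nu(S_{-ta}[A \cap \Gamma_t])\, p(t)\, dt. \qquad (2)$$

We note that $\mu^*$ is the distribution of a random variable $\xi$ such that the variable $(a, \xi)$ and the projection of $\xi$ on $\Gamma_0$ are independent and their distributions coincide with the distributions of the variable $(a, x)$ and the projection of $x$ on $\Gamma_0$ in the probability space $(\mathscr{X}, \mathfrak{B}, \mu)$. We show that the measure $\mu$ is absolutely continuous with respect to $\mu^*$. Note that

$$\mu_{\lambda a}(A) = \mu(S_{-\lambda a} A) = \int \nu^t(S_{-ta}[S_{-\lambda a} A \cap \Gamma_t])\, p(t)\, dt = \int \nu^t(S_{-(t + \lambda) a}[A \cap \Gamma_{t + \lambda}])\, p(t)\, dt = \int \nu^{t - \lambda}(S_{-ta}[A \cap \Gamma_t])\, p(t - \lambda)\, dt. \qquad (3)$$

Consequently for any bounded integrable function $k(\lambda)$ the representation

$$\int k(\lambda)\, \mu_{\lambda a}(A)\, d\lambda = \iint k(\lambda)\, \nu^{t - \lambda}(S_{-ta}[A \cap \Gamma_t])\, p(t - \lambda)\, dt\, d\lambda$$

is valid. It follows from the definition of the measure $\mu^*$ that

$$\mu^*(A) = \iint \nu^s(S_{-ta}[A \cap \Gamma_t])\, p(s)\, p(t)\, dt\, ds.$$

If $\mu^*(A) = 0$, then $\nu^s(S_{-ta}[A \cap \Gamma_t]) = 0$ for almost all $t > t_1$ and $s > t_1$ (in the Lebesgue measure), where $t_1$ is defined in the lemma. We may assume without loss of generality that $\nu^t(\mathscr{X}) = 0$ for $t < t_1$. Moreover it is natural to consider only those sets $A$ which belong for some $\delta > 0$ to the set $\{x : (a, x) > t_1 + \delta\}$ since

$$\mu^*(\{x : (a, x) \le t_1\}) = \mu(\{x : (a, x) \le t_1\}) = 0.$$

Under these assumptions the condition $\mu^*(A) = 0$ implies the equality

$$\int_{-\delta}^{0} \mu_{\lambda a}(A)\, d\lambda = \int_{-\delta}^{0} \int p(t - \lambda)\, \nu^{t - \lambda}(S_{-ta}[A \cap \Gamma_t])\, dt\, d\lambda = 0,$$

and hence $\mu_{\lambda a}(A) = 0$ for almost all $\lambda \in [-\delta, 0]$. Our assertion is proved since $\mu \ll \mu_{\lambda a}$ for such $\lambda$, hence $\mu(A) = 0$. □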

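We note that $\mu^*_{\lambda a} \ll \mu^*$ for all $\lambda > 0$. Indeed from the definition

$$\mu^*_{\lambda a}(A) = \int \nu(S_{-ta}[A \cap \Gamma_t])\, p(t - \lambda)\, dt = \int \frac{p(t - \lambda)}{p(t)}\, \nu(S_{-ta}[A \cap \Gamma_t])\, p(t)\, dt.$$

Hence

$$\frac{d\mu^*_{\lambda a}}{d\mu^*}(x) = \frac{p((a, x) - \lambda)}{p((a, x))}.$$

Therefore utilizing Theorem 3 part 2) we obtain the following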



Theorem 4. In order that $\lambda a \in M_\mu$ for all $\lambda > 0$, it is necessary and sufficient that the following conditions be satisfied:

1) the function $F(t) = \mu(\{x : (a, x) < t\})$ be absolutely continuous and there exist $t_1$ (possibly equal to $-\infty$) such that $p(t)$, the derivative of $F(t)$, satisfies $p(t) = 0$ for almost all $t < t_1$ and $p(t) > 0$ for almost all $t > t_1$;

2) the measure $\mu$ be absolutely continuous with respect to the measure $\mu^*$ defined by

$$\mu^*(\{x : \alpha < (a, x) < \beta\} \cap \{x : Px \in A\}) = \mu(\{x : \alpha < (a, x) < \beta\})\, \mu(\{x : Px \in A\}),$$

where $P$ is the projector on the subspace $\Gamma_0$, and its density $q(x) = \frac{d\mu}{d\mu^*}(x)$ is such that the expression

$$\rho_\mu(\lambda a, x) = \frac{p((a, x) - \lambda)\, q(x - \lambda a)}{p((a, x))\, q(x)} \qquad (4)$$

is defined $\mu^*$-almost everywhere, provided we assume that the expression is zero as long as the numerator is zero.



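Remark. If $\lambda a \in M_\mu$ for all real $\lambda$, then $p(t) > 0$ for almost all $t$ and the measures $\mu$ and $\mu^*$ are equivalent. Indeed $\mu^*$ is equivalent to the measure $\int k(\lambda)\, \mu_{\lambda a}\, d\lambda$ provided $k(\lambda)$ is a positive integrable function, and the measure $\int k(\lambda)\, \mu_{\lambda a}\, d\lambda$ is absolutely continuous with respect to $\mu$.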



Corollary. If $a_1, \ldots, a_n$ ($|a_k| = 1$) are mutually orthogonal vectors such that $\lambda a_k \in M_\mu$ for all real $\lambda$ and $k = 1, \ldots, n$, then: 1) the functions $F_k(t) = \mu(\{x : (a_k, x) < t\})$ are absolutely continuous and their derivatives $p_k(t) = \frac{dF_k}{dt}(t)$ are positive for almost all $t$; 2) the measure $\mu$ is equivalent to the measure $\bar\mu$ defined by the relation

$$\bar\mu\Big(\{x : P_n x \in C\} \cap \bigcap_{k=1}^{n} \{x : \alpha_k < (a_k, x) < \beta_k\}\Big) = \mu(\{x : P_n x \in C\}) \prod_{k=1}^{n} \mu(\{x : \alpha_k < (a_k, x) < \beta_k\}),$$

where $\alpha_k < \beta_k$, $k = 1, \ldots, n$, are arbitrary real numbers, $P_n$ is the projector on the subspace $\mathscr{X}^n = \{x : (x, a_k) = 0,\ k = 1, \ldots, n\}$ and $C$ is an arbitrary Borel set in this subspace.

The proof of this assertion follows from the fact that any admissible shift of measure $\mu^*$, constructed in the proof of Theorem 4, orthogonal to $a$ will be an admissible shift of measure $\nu$ defined in the course of the same proof. Note that the measure $\bar\mu$ defined above is the distribution of the variable

$$\sum_{k=1}^{n} \eta_k a_k + \xi^n,$$

where $\eta_k$ are independent random variables with values in $R^1$ and densities $p_k(t)$, and $\xi^n$ is the variable, independent of the $\eta_k$, with values in $\mathscr{X}^n$, distributed as the projection of $x$ on $\mathscr{X}^n$ is distributed on the probability space $(\mathscr{X}, \mathfrak{B}, \mu)$.

Assume that an orthonormal sequence of vectors $a_k$ can be constructed such that $M_\mu$ contains the linear span of these vectors. Is it possible, by analogy with the corollary formulated above, to assert that the measure $\mu$ is equivalent to a measure which is the distribution of the random variable

$$\zeta = \sum_{k=1}^{\infty} \eta_k a_k, \qquad (5)$$

where $\eta_k$ is a sequence of independent numerical random variables possessing positive densities? To answer this query we investigate admissible shifts of the distribution of a variable of form (5). Denote by $\Pi^\infty$ the set of measures which are distributions of variables $\zeta$ of form (5).

Theorem 5. If $\eta_k$ are independent random variables with positive densities $p_k(t)$ and $\mu$ is the distribution of the variable $\zeta$ defined by equality (5), then $\mu_a \ll \mu$ iff the series

$$\sum_{k=1}^{\infty} [\log p_k(\eta_k - \alpha_k) - \log p_k(\eta_k)], \qquad (6)$$

is convergent with probability 1, where $\alpha_k = (a, a_k)$. If this series is divergent then $\mu_a \perp \mu$.

Proof. Let $\mu^n$ and $\mu_a^n$ denote the projections of measures $\mu$ and $\mu_a$ on the subspace spanned by $a_1, \ldots, a_n$. Then

$$\frac{d\mu_a^n}{d\mu^n}(x) = \prod_{k=1}^{n} \frac{p_k((x, a_k) - \alpha_k)}{p_k((x, a_k))}.$$

If $\mu_a$ is not orthogonal to $\mu$, then the limit

$$\lim_{n \to \infty} \prod_{k=1}^{n} \frac{p_k((x, a_k) - \alpha_k)}{p_k((x, a_k))}$$

exists $\mu$-almost everywhere. Moreover this limit is not identically zero. Hence the series (6) of independent random variables converges with positive probability; therefore this series converges with probability 1. (Note that the sequence $(x, a_k)$ on the probability space $(\mathscr{X}, \mathfrak{B}, \mu)$ has the same distribution as the sequence $\eta_k$.) Hence if $\mu_a \ll \mu$, series (6) is convergent and therefore

$$\frac{d\mu_a}{d\mu}(x) = \exp\left\{ \sum_{k=1}^{\infty} \log \frac{p_k((x, a_k) - \alpha_k)}{p_k((x, a_k))} \right\}$$

is everywhere positive, so that $\mu_a \sim \mu$. Conversely, the existence of the non-zero limit

$$\lim_{n \to \infty} \frac{d\mu_a^n}{d\mu^n}$$

follows from the convergence of the series (6). Therefore in view of Corollary 1 to Theorem 1 of Section 1 we have that $\mu \ll \mu_a$ and hence, as was proved above, $\mu \sim \mu_a$. The theorem is thus proved. □
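For a concrete feel of criterion (6), consider the special case (an assumption made here only for illustration) in which $p_k$ is the density of $N(0, \lambda_k)$. Then the $k$-th term of (6) equals $\alpha_k \eta_k / \lambda_k - \alpha_k^2 / (2\lambda_k)$, a Gaussian variable with mean $-\alpha_k^2/(2\lambda_k)$ and variance $\alpha_k^2/\lambda_k$, so the series converges with probability 1 exactly when $\sum_k \alpha_k^2 / \lambda_k < \infty$. The sketch below tabulates partial sums of this numerical series for two hypothetical choices of $\alpha_k$.

```python
import math

def lam(k):
    """Hypothetical variances lambda_k = k^(-2) of the coordinates eta_k."""
    return k ** -2.0

def alpha_admissible(k):
    """Shift coordinates with alpha_k^2 / lambda_k = k^(-2): summable."""
    return k ** -2.0

def alpha_singular(k):
    """Shift coordinates with alpha_k^2 / lambda_k = k^(-1): not summable."""
    return k ** -1.5

def energy_partial_sum(alpha, n):
    """Partial sum over k = 1..n of alpha(k)^2 / lam(k)."""
    return sum(alpha(k) ** 2 / lam(k) for k in range(1, n + 1))

for n in (10, 1000, 100000):
    print(n, energy_partial_sum(alpha_admissible, n), energy_partial_sum(alpha_singular, n))
```

The first column of sums stabilizes near $\pi^2/6$, which corresponds to an equivalent shifted measure; the second grows like $\log n$, which corresponds to orthogonality.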

Corollary. If $\tilde\mu \sim \mu$ where $\mu \in \Pi^\infty$, then $\tilde\mu_a$ and $\tilde\mu$ are either equivalent or orthogonal.

Thus if we construct a measure $\mu$ such that $\lambda a_k \in M_\mu$ for all $\lambda$ and $k$ and such that an $a$ exists for which the measures $\mu$ and $\mu_a$ are neither equivalent nor orthogonal, we then construct a measure for which an equivalent measure $\tilde\mu$ belonging to $\Pi^\infty$ does not exist. We shall construct an example of such a measure below.

Admissible shifts of weighted measures. First we consider admissible shifts of linear combinations of measures. If $\mu = \mu^1 + \mu^2$ then it is natural to consider only the case when the measures $\mu^1$ and $\mu^2$ are orthogonal, since under the condition $\mu^1 \sim \mu^2$ the measures $\mu$ and $\mu^1$ are equivalent and hence their admissible shifts coincide; if however $\mu^2$ has an absolutely continuous component with respect to $\mu^1$, it can be adjoined to $\mu^1$ and the problem is reduced to the case of singular measures. In what follows we shall assume that $\mu_a^1 \perp \mu^2$ for all $a$. (Note that without this assumption a connection between admissible shifts of the measures $\mu^1$, $\mu^2$ and $\mu$ can hardly be established. Indeed consider the measures

$$\mu^k(A) = \int_A f_k(x)\, \mu(dx), \quad k = 1, 2, \qquad f_1(x) + f_2(x) = 1, \quad f_1(x) f_2(x) = 0;$$

here $\mu^1$ and $\mu^2$ may have no admissible shifts at all, while $\mu$ may have them.) If this assumption is satisfied, then $M_\mu = M_{\mu^1} \cap M_{\mu^2}$. Indeed it follows from the relations $\mu_a^1 \ll \mu^1$ and $\mu_a^2 \ll \mu^2$ that $\mu_a = \mu_a^1 + \mu_a^2 \ll \mu^1 + \mu^2 = \mu$. Conversely let $\mathscr{X} = E_1 \cup E_2$ and for some $a$ let $\mu^1(E_2) = 0$, $\mu^2(E_1) = 0$, $\mu_a^1(E_2) = 0$ and $\mu_a^2(E_1) = 0$. Then if $a \in M_\mu$, it follows from the condition $\mu^1(A) = 0$ that $\mu(A \cap E_1) = 0$ and hence $0 = \mu_a(A \cap E_1) = \mu_a^1(A \cap E_1)$. Thus $\mu_a^1 \ll \mu^1$. In the same manner $\mu_a^2 \ll \mu^2$. Hence $M_\mu = M_{\mu^1} \cap M_{\mu^2}$. We now find the expression of $\rho_\mu(a, x)$ in terms of $\rho^1(a, x) = \frac{d\mu_a^1}{d\mu^1}(x)$ and $\rho^2(a, x) = \frac{d\mu_a^2}{d\mu^2}(x)$.

If the sets $E_1$ and $E_2$ are as defined above, then

$$\mu_a(A) = \mu_a^1(A \cap E_1) + \mu_a^2(A \cap E_2) = \int_{A \cap E_1} \rho^1(a, x)\, \mu^1(dx) + \int_{A \cap E_2} \rho^2(a, x)\, \mu^2(dx) = \int_A [\rho^1(a, x)\, \chi_{E_1}(x) + \rho^2(a, x)\, \chi_{E_2}(x)]\, \mu(dx),$$

where $\chi_{E_i}$ is the indicator of the set $E_i$. Therefore

$$\rho_\mu(a, x) = \sum_{i=1}^{2} \rho^i(a, x)\, \chi_{E_i}(x).$$

The result obtained can be easily extended to the case of a countable number of measures.



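Theorem 6. Let $\mu^k$ be a sequence of pairwise orthogonal measures such that $\mu_a^i \perp \mu^k$ for $i \ne k$ for any $a \in \mathscr{X}$. If $\mu = \sum_k p_k \mu^k$ where $p_k > 0$ and $\sum_k p_k = 1$, then $M_\mu = \bigcap_{k=1}^{\infty} M_{\mu^k}$ and one can find pairwise disjoint sets $E_k$ such that

$$\rho_\mu(a, x) = \sum_{k=1}^{\infty} \rho^k(a, x)\, \chi_{E_k}(x), \qquad (7)$$

where $\rho^k(a, x) = \frac{d\mu_a^k}{d\mu^k}(x)$.

Proof. The inclusion $M_\mu \supset \bigcap_{k=1}^{\infty} M_{\mu^k}$ follows from the fact that $\sum_k p_k \mu_a^k \ll \sum_k p_k \mu^k$ provided $\mu_a^k \ll \mu^k$ for all $k$. On the other hand, if for two singular measures $\mu^k$ and $\sum_{l \ne k} p_l \mu^l$ the condition $\mu_a^k \perp \sum_{l \ne k} p_l \mu^l$ is satisfied, then, as shown above for a sum of two measures, a shift that is admissible for their sum is admissible for each one of the summands. Hence $M_\mu \subset M_{\mu^k}$ and $M_\mu = \bigcap_{k=1}^{\infty} M_{\mu^k}$. Let the set $E_i'$ be such that $\mu^i(E_i') = \mu_a^i(E_i') = 1$, $\sum_{k \ne i} p_k \mu^k(E_i') = 0$. The existence of such a set follows from the fact that the measures $\mu^i + \mu_a^i$ and $\sum_{k \ne i} p_k \mu^k$ are singular. Now set $E_i = E_i' \setminus \bigcup_{k \ne i} E_k'$. Clearly, $\mu^i(E_i) = \mu_a^i(E_i) = 1$ and the sets $E_i$ are pairwise disjoint. Next

$$\mu_a(A) = \sum_k p_k \mu_a^k(A) = \sum_k p_k \mu_a^k(A \cap E_k) = \sum_k p_k \int_{A \cap E_k} \rho^k(a, x)\, \mu^k(dx) = \sum_k \int_{A \cap E_k} \rho^k(a, x) \sum_i p_i \mu^i(dx) = \sum_k \int_{A \cap E_k} \rho^k(a, x)\, \mu(dx).$$

This equality yields formula (7). The theorem is thus proved. □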

Consider now shifts of measures represented by integrals (rather than sums) of families of measures depending on a continuously varying parameter. Let $\Theta$ be a complete metric space and $\mathfrak{S}$ be the $\sigma$-algebra of its Borel subsets. Consider the family of measures $\mu^\theta$ on $(\mathscr{X}, \mathfrak{B})$, $\theta \in \Theta$, satisfying the following condition: for any continuous bounded function $f(x)$ defined on $\mathscr{X}$ the function $\int f(x)\, \mu^\theta(dx)$ is continuous in $\theta$. From here it follows that $\mu^\theta(A)$ is an $\mathfrak{S}$-measurable function of $\theta$ for all $A \in \mathfrak{B}$. Let $\sigma(d\theta)$ be a certain (probability) measure on $\mathfrak{S}$. Consider the measure

$$\mu(A) = \int \mu^\theta(A)\, \sigma(d\theta), \quad A \in \mathfrak{B}. \qquad (8)$$



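Theorem 7. Let $M_\theta$ be the set of admissible shifts of measure $\mu^\theta$; then $M_\mu \supset \bigcap_\theta M_\theta$. If moreover, the measures $\mu^\theta$ are mutually orthogonal and there exists a $\mathfrak{B}$-measurable function $\theta(x)$ with values in $\Theta$ such that

$$\mu^\theta(\{x : \theta(x) = \theta\}) = 1,$$

then one can construct a function $\rho^\theta(a, x)$, $a \in \bigcap_\theta M_\theta$, measurable in $\theta$ and $x$ on $\mathfrak{S} \times \mathfrak{B}$ and such that $\rho^\theta(a, x) = \frac{d\mu_a^\theta}{d\mu^\theta}(x)$ (mod $\mu^\theta$) for all $\theta \in \Theta$ and

$$\rho_\mu(a, x) = \rho^{\theta(x)}(a, x). \qquad (9)$$

Proof. If $a \in \bigcap_\theta M_\theta$, then $\mu_a^\theta \ll \mu^\theta$ for all $\theta$. Let $\mu(A) = 0$; then $\mu^\theta(A) = 0$ $\sigma$-almost for all $\theta$. But then $\mu_a^\theta(A) = 0$ $\sigma$-almost for all $\theta$. Hence $\mu_a(A) = 0$ and $a \in M_\mu$.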
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I 



Let be an increasing sequence of finite-dimensional subspaces of 
•) and •) be the projections of measures fil and j/ on these 

subspaces and pl(a, ^ (x). Note that the function 

are continuous jointly in 9 and y, therefore the function 

9i{0, /(«> dx)^ j* Qf,{a, x) dx) 

also possesses this property. The latter function converges, as /->oo, 
jU® (n, • )-almost everywhere, to g^{a,y). Hence the function Q^(a,x) = 
= lim gi{6, x) - where the limit exists - is measurable on ® x S and for 

l-^oo 

each 9 coincides with g^(a, x) //{n, -j-almost everywhere. Set x) = 
= lim Ql{a, x) (where the limit exists); this function is that required. To 

n 00 

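derive formula (9) we write the following equality

$$\int f(x)\, \mu_a(dx) = \int f(x + a)\, \mu(dx) = \int \int f(x + a)\, \mu^\theta(dx)\, \sigma(d\theta) = \int \left[ \int f(x)\, \rho^\theta(a, x)\, \mu^\theta(dx) \right] \sigma(d\theta) = \int f(x)\, \rho^{\theta(x)}(a, x)\, \mu(dx),$$

utilizing the fact that $\rho^{\theta(x)}(a, x) = \rho^\theta(a, x)$ $\mu^\theta$-almost for all $x$. The theorem is thus proved. □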

Remark. It is easy to write the expression for $\rho_\mu(a, x)$ under the condition that all $\mu^\theta$ are absolutely continuous with respect to a certain measure $\nu$ and the function

$$g(\theta, x) = \frac{d\mu^\theta}{d\nu}(x)$$

is measurable jointly in the variables $\theta$ and $x$. In this case

$$\int f(x)\, \mu_a(dx) = \int f(x + a)\, \mu(dx) = \int \int f(x + a)\, \mu^\theta(dx)\, \sigma(d\theta) = \int \int f(x)\, \rho^\theta(a, x)\, g(\theta, x)\, \nu(dx)\, \sigma(d\theta) = \int f(x)\, \frac{\int \rho^\theta(a, x)\, g(\theta, x)\, \sigma(d\theta)}{\int g(\theta', x)\, \sigma(d\theta')}\, \mu(dx),$$

since for any bounded measurable function $\varphi(x)$

$$\int \varphi(x)\, \mu(dx) = \int \int \varphi(x)\, g(\theta', x)\, \nu(dx)\, \sigma(d\theta').$$

Therefore under the assumptions stipulated above we have

$$\rho_\mu(a, x) = \frac{\int \rho^\theta(a, x)\, g(\theta, x)\, \sigma(d\theta)}{\int g(\theta, x)\, \sigma(d\theta)}.$$
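The weighted-average formula of the Remark can be sanity-checked on a finite mixture. The sketch below makes several assumptions that are not in the text: the space is the real line, $\nu$ is Lebesgue measure, $\mu^\theta = N(\theta, 1)$, and $\sigma$ is a discrete measure on three hypothetical parameter values. It compares the formula with the direct ratio of the shifted and unshifted mixture densities.

```python
import math

def normal_pdf(x, mean, sd=1.0):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

# Hypothetical finite parameter set: theta -> sigma({theta}); weights sum to 1.
SIGMA = {-2.0: 0.2, 0.0: 0.5, 3.0: 0.3}

def g(theta, x):
    """Density of mu^theta = N(theta, 1) with respect to Lebesgue measure nu."""
    return normal_pdf(x, theta)

def rho_theta(theta, a, x):
    """Shift density d(mu^theta_a)/d(mu^theta) of a single component."""
    return g(theta, x - a) / g(theta, x)

def rho_mixture_formula(a, x):
    """Formula of the Remark: sigma-average of rho^theta * g divided by the average of g."""
    num = sum(w * rho_theta(th, a, x) * g(th, x) for th, w in SIGMA.items())
    den = sum(w * g(th, x) for th, w in SIGMA.items())
    return num / den

def rho_mixture_direct(a, x):
    """Direct computation: ratio of the mixture density at x - a and at x."""
    def mix(y):
        return sum(w * g(th, y) for th, w in SIGMA.items())
    return mix(x - a) / mix(x)

print(rho_mixture_formula(0.8, 1.1), rho_mixture_direct(0.8, 1.1))
```

Both computations agree up to rounding, as they must, since $\rho^\theta(a, x)\, g(\theta, x) = g(\theta, x - a)$ in this setting.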



This result can be generalized as follows: 

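Let the family of measures $\mu^{\alpha, \theta}$ depend on two parameters $\alpha$ and $\theta$ varying in two separable complete metric spaces $\mathscr{A}$ and $\Theta$ correspondingly, and let a family of measures $\mu^\theta$ exist such that the conditions of Theorem 7 are satisfied and $\mu^{\alpha, \theta} \ll \mu^\theta$ for all $\alpha \in \mathscr{A}$. Assume that for each continuous bounded function $f(x)$ the integral $\int f(x)\, \mu^{\alpha, \theta}(dx)$ is a continuous function of $\alpha$ and $\theta$. Denote by $\mathfrak{A}$ and $\mathfrak{S}$ the $\sigma$-algebras of Borel sets in $\mathscr{A}$ and $\Theta$ correspondingly. Let $\sigma(d\alpha, d\theta)$ be a probability measure on $\mathfrak{A} \times \mathfrak{S}$ and let the measure $\mu$ on $\mathfrak{B}$ be defined by the relation

$$\mu(E) = \int \mu^{\alpha, \theta}(E)\, \sigma(d\alpha, d\theta).$$

Then $M_\mu \supset \bigcap_{\alpha, \theta} M_{\alpha, \theta}$, where $M_{\alpha, \theta}$ is the set of admissible shifts of measure $\mu^{\alpha, \theta}$, and for $a \in \bigcap_{\alpha, \theta} M_{\alpha, \theta}$ there exist functions $\rho^{\alpha, \theta}(a, x)$ and $g(\alpha, \theta, x)$ measurable in $\alpha, \theta, x$ on $\mathfrak{A} \times \mathfrak{S} \times \mathfrak{B}$ such that

$$\rho^{\alpha, \theta}(a, x) = \frac{d\mu_a^{\alpha, \theta}}{d\mu^{\alpha, \theta}}(x) \pmod{\mu^{\alpha, \theta}},$$

$$g(\alpha, \theta, x) = \frac{d\mu^{\alpha, \theta}}{d\mu^\theta}(x) \pmod{\mu^\theta},$$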




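and $\rho_\mu(a, x)$ is expressed in terms of these functions by the formula

$$\rho_\mu(a, x) = \frac{\int \rho^{\alpha, \theta(x)}(a, x)\, g(\alpha, \theta(x), x)\, \sigma(d\alpha \mid \theta(x))}{\int g(\alpha, \theta(x), x)\, \sigma(d\alpha \mid \theta(x))}, \qquad (10)$$

where $\sigma(d\alpha \mid \theta)$ is the conditional measure defined by the relation

$$\int \left[ \int \psi(\alpha, \theta)\, \sigma(d\alpha \mid \theta) \right] \sigma(\mathscr{A}, d\theta) = \int \psi(\alpha, \theta)\, \sigma(d\alpha, d\theta),$$

which is valid for any function $\psi(\alpha, \theta)$ measurable in the variables $\alpha$ and $\theta$, and $\theta(x)$ is a measurable function for which $\mu^\theta(\{x : \theta(x) = \theta\}) = 1$. Formula (10) follows from the following string of equations

$$\int f(x)\, \mu_a(dx) = \int \int \int f(x)\, \rho^{\alpha, \theta}(a, x)\, g(\alpha, \theta, x)\, \mu^\theta(dx)\, \sigma(d\alpha \mid \theta)\, \sigma(\mathscr{A}, d\theta) = \int \int f(x) \left[ \int \rho^{\alpha, \theta(x)}(a, x)\, g(\alpha, \theta(x), x)\, \sigma(d\alpha \mid \theta(x)) \right] \mu^\theta(dx)\, \sigma(\mathscr{A}, d\theta) = \int f(x)\, \frac{\int \rho^{\alpha, \theta(x)}(a, x)\, g(\alpha, \theta(x), x)\, \sigma(d\alpha \mid \theta(x))}{\int g(\alpha, \theta(x), x)\, \sigma(d\alpha \mid \theta(x))}\, \mu(dx),$$

and the existence of the functions $\rho^{\alpha, \theta}(a, x)$ and $g(\alpha, \theta, x)$ is proved in exactly the same manner as the existence of the function $\rho^\theta(a, x)$ in Theorem 7.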

As it follows from formula (9), the density $\rho_\mu(a, x)$ of the measure given by formula (8) does not depend on the measure $\sigma$. If $\mu_1$ and $\mu_2$ are two measures defined by the relation

$$\mu_k(A) = \int \mu^\theta(A)\, \sigma_k(d\theta)$$

and $\sigma_1 \sim \sigma_2$, then also $\mu_1 \sim \mu_2$ and moreover

$$\frac{d\mu_1}{d\mu_2}(x) = \frac{d\sigma_1}{d\sigma_2}(\theta(x)),$$

where $\theta(x)$ is a function such that $\mu^\theta(\{x : \theta(x) = \theta\}) = 1$. Since $\rho_{\mu_1}(a, x)$ and $\rho_{\mu_2}(a, x)$ coincide, we have $\frac{d\mu_2}{d\mu_1}(x - a) = \frac{d\mu_2}{d\mu_1}(x)$, and this equation is valid if for all $a \in M_{\mu_1}$ and almost all $x$ the equality $\theta(x - a) = \theta(x)$ holds.
It turns out that this case is general to a certain extent. Assume that measures $\nu$ and $\mu$ are equivalent and that for some $a$, $\lambda a \in M_\mu$ for all real $\lambda$ and $\rho_\mu(a, x) = \rho_\nu(a, x)$. Set $\varphi(x) = \frac{d\nu}{d\mu}(x)$. Let $E_t = \{x : \varphi(x) = t\}$ and $\sigma$ be the measure on the real line such that

$$\sigma((-\infty, t)) = \mu\Big(\bigcup_{s < t} E_s\Big),$$

and let the family of measures $\mu^t$ be determined by the relation

$$\mu(A \cap \{x : \varphi(x) \in \Lambda\}) = \int_\Lambda \mu^t(A)\, \sigma(dt), \qquad (11)$$

which is valid for all $A \in \mathfrak{B}$ and all Borel sets $\Lambda$ on the real line (i.e. $\mu^t$ is the conditional distribution of $x$ on the probability space $(\mathscr{X}, \mathfrak{B}, \mu)$ given $\varphi(x) = t$). We show that $a$ is an admissible shift of the measures $\mu^t$ $\sigma$-almost for all $t$. It follows from Theorem 4 that the measure $\mu$ can be represented in the form

$$\mu(A) = \iint_{x + sa \in A,\ x \in \mathscr{X}_0} f(x, s)\, \tilde\mu(dx)\, dF(s),$$

where $f(x, s) = \frac{d\mu}{d\mu^*}(x + sa)$, $\tilde\mu$ is the measure on the subspace $\mathscr{X}_0 = \{x : (a, x) = 0\}$ determined by the relation $\tilde\mu(A) = \mu(P^{-1} A)$, $P$ is the projector on $\mathscr{X}_0$, and $F(s) = \mu(\{x : (a, x) < s\})$. Analogously to representation (11) we have

$$\tilde\mu(A) = \int \tilde\mu^t(A)\, \sigma(dt).$$

(Note that the equality $\varphi(Px) = \varphi(x)$ $\mu$-almost for all $x$ yields the relation

$$\tilde\mu\Big(\bigcup_{s < t} E_s\Big) = \tilde\mu(\{x : \varphi(x) < t\}) = \mu(\{x : \varphi(x) < t\}) = \sigma((-\infty, t)).)$$

Let $f(t, x, s)$ be a measurable function such that $f(t, x, s) = f(x, s)$ for $\varphi(x) = t$. Then

$$\mu(A) = \iiint_{x + sa \in A} f(t, x, s)\, \tilde\mu^t(dx)\, \sigma(dt)\, dF(s) = \int \left[ \iint_{x + sa \in A} f(t, x, s)\, \tilde\mu^t(dx)\, dF(s) \right] \sigma(dt).$$

Setting

$$\iint_{x + sa \in A} f(t, x, s)\, \tilde\mu^t(dx)\, dF(s) = \mu^{*t}(A),$$

we obtain

$$\mu(A) = \int \mu^{*t}(A)\, \sigma(dt).$$

Hence for each measurable set $\Lambda$ on the line and each continuous function $g(x)$ the equality

$$\int_\Lambda \int g(x)\, \mu^t(dx)\, \sigma(dt) = \int_\Lambda \int g(x)\, \mu^{*t}(dx)\, \sigma(dt)$$

is satisfied, i.e. $\int g(x)\, \mu^t(dx) = \int g(x)\, \mu^{*t}(dx)$ $\sigma$-almost for all $t$. It follows from the representation of $\mu^{*t}$ that $a$ is an admissible shift for this measure. Finally note that

$$\nu(A) = \int_A \varphi(x)\, \mu(dx) = \int \int \chi_A(x)\, \varphi(x)\, \mu^t(dx)\, \sigma(dt) = \int \mu^t(A)\, t\, \sigma(dt) = \int \mu^t(A)\, \sigma_1(dt),$$

where $\frac{d\sigma_1}{d\sigma}(t) = t$. We have thus proved the following

da 



Theorem 8. If $\mu$ and $\nu$ are two equivalent measures for which the sets $M_\mu$ and $M_\nu$ satisfy $M_\mu = M_\nu$ (these sets are linear manifolds) and $\rho_\mu(a, x) = \rho_\nu(a, x)$, then there exists a one-parametric family of measures $\mu^t$ such that $a \in M_{\mu^t}$, a measurable function $\varphi(x)$ such that $\mu^t(\{x : \varphi(x) = t\}) = 1$, and also equivalent measures (on the line) $\sigma_1$ and $\sigma$ such that

$$\mu(A) = \int \mu^t(A)\, \sigma(dt), \qquad \nu(A) = \int \mu^t(A)\, \sigma_1(dt).$$

Moreover the function $\varphi(x)$ satisfies the equality $\varphi(x - a) = \varphi(x)$ for all $a \in M_\mu$ and $\mu$-almost for all $x$.

A sufficient condition for admissibility of the shift. Consider a measure $\mu$ on $(\mathscr{X}, \mathfrak{B})$. Let $\{e_n, n = 1, 2, \ldots\}$ be an orthonormal basis in $\mathscr{X}$, $\mathscr{X}_n$ be the subspace spanned by $e_1, \ldots, e_n$, $P_n$ be the projector on this subspace and $\mu^n$ the projection of measure $\mu$ on $\mathscr{X}_n$. Assume that $\mu^n$ is absolutely continuous with respect to Lebesgue's measure on $\mathscr{X}_n$ and that its density $f_n(x)$ with respect to this measure is positive. Then

$$\frac{d\mu_a^n}{d\mu^n}(x) = \frac{f_n(x - P_n a)}{f_n(x)}.$$

In view of Theorem 1 in Section 1 the limit

$$\rho(x, a) = \lim_{n \to \infty} \frac{f_n(P_n x - P_n a)}{f_n(P_n x)}$$

exists $\mu$-almost everywhere.

If $a \in M_\mu$, then $\rho(x, a) = \rho_\mu(a, x)$, and in order that $a \in M_\mu$ it is necessary and sufficient that $\int \rho(x, a)\, \mu(dx) = 1$ (cf. Corollary 2 to Theorem 1 in Section 1). It is difficult to verify this condition. Below conditions are presented which guarantee that $ta \in M_\mu$ for all real $t$. These conditions are based on the remark following Theorem 2 in Section 1.

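Assume that the density $f_n(x)$ is continuously differentiable with respect to $x$ and denote by $\nabla f_n(x)$ the gradient of $f_n(x)$, i.e. a vector in $\mathscr{X}_n$ such that $\frac{d}{dt} f_n(x + ta)\big|_{t = 0} = (\nabla f_n(x), a)$ for all $a \in \mathscr{X}_n$. Set for all $x \in \mathscr{X}$ and $a \in \mathscr{X}$

$$h_n(x, a) = \frac{(\nabla f_n(P_n x), a)}{f_n(P_n x)}.$$

We show that the functions $h_n(x, a)$, for a fixed $a$ and a running $n$, form a martingale on the probability space $(\mathscr{X}, \mathfrak{B}, \mu)$. Denote by $\mathfrak{B}_n$ the $\sigma$-algebra of cylindrical sets with bases in $\mathscr{X}_n$. Let $\alpha_k = (a, e_k)$, $t_k = (x, e_k)$, $f_n(x) = F_n(t_1, \ldots, t_n)$. Then

$$h_n(x, a) = \frac{1}{F_n(t_1, \ldots, t_n)} \sum_{k=1}^{n} \alpha_k \frac{\partial F_n}{\partial t_k}.$$

If $\varphi(x)$ is a $\mathfrak{B}_n$-measurable function then $\varphi(x) = \Phi(t_1, \ldots, t_n)$. First assume that $\Phi(t_1, \ldots, t_n)$ is a continuously differentiable function which vanishes everywhere except for some bounded region. Then, since $\int F_{n+1}(t_1, \ldots, t_{n+1})\, dt_{n+1} = F_n(t_1, \ldots, t_n)$,

$$\mathsf{E}\, h_{n+1}(x, a)\, \varphi(x) = \int \frac{1}{F_{n+1}(t_1, \ldots, t_{n+1})} \sum_{k=1}^{n+1} \alpha_k \frac{\partial F_{n+1}}{\partial t_k}\, \Phi(t_1, \ldots, t_n)\, F_{n+1}(t_1, \ldots, t_{n+1})\, dt_1 \ldots dt_{n+1} = \mathsf{E}\, h_n(x, a)\, \varphi(x) + \alpha_{n+1} \int \Phi(t_1, \ldots, t_n)\, \frac{\partial F_{n+1}}{\partial t_{n+1}}\, dt_1 \ldots dt_{n+1} = \mathsf{E}\, h_n(x, a)\, \varphi(x).$$

Since the functions $\varphi(x)$ of this type are everywhere dense in the space of all bounded $\mathfrak{B}_n$-measurable functions, the equality

$$\mathsf{E}\, h_{n+1}(x, a)\, \varphi(x) = \mathsf{E}\, h_n(x, a)\, \varphi(x)$$

is valid for all bounded $\mathfrak{B}_n$-measurable functions. From the last equality follows the relation

$$\mathsf{E}(h_{n+1}(x, a) \mid \mathfrak{B}_n) = h_n(x, a),$$

i.e. $h_n(x, a)$ is a martingale.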

Denote by $N$ the set of $a$ such that

$$\sup_n \int h_n^2(x, a)\, \mu(dx) < \infty.$$

For all $a \in N$ the limit $h(x, a) = \lim_{n \to \infty} h_n(x, a)$ exists $\mu$-almost everywhere and moreover, the sequence $\{h_1(x, a), \ldots, h_n(x, a), \ldots, h(x, a)\}$ is also a martingale on the probability space $(\mathscr{X}, \mathfrak{B}, \mu)$ and

$$\lim_{n \to \infty} \int [h(x, a) - h_n(x, a)]^2\, \mu(dx) = 0.$$

It is easy to verify that $N$ is a linear manifold and moreover for each real $t$ and for $a$ and $b$ in $N$ the relations

$$h(x, ta) = t\, h(x, a), \qquad h(x, a + b) = h(x, a) + h(x, b)$$

are valid $\mu$-almost for all $x$. The next theorem presents sufficient conditions for the inclusion $a \in M_\mu$ in terms of the function $h(x, a)$ defined above.

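Theorem 9. Let the densities $f_n(x)$ be positive and continuously differentiable and let $h(x, a)$ be defined for some $a \in \mathscr{X}$. If for some $\delta > 0$

$$\int e^{\delta |h(x, a)|}\, \mu(dx) < \infty,$$

then $ta \in M_\mu$ for all real $t$ and the formula

$$\rho(ta, x) = \exp\left\{ -\int_0^t h(x - sa, a)\, ds \right\} \qquad (12)$$

is valid.

Proof. Set

$$\Gamma_n(t) = \int_{\mathscr{X}_n} f_n(x) \ln\left( 1 + \frac{f_n(x)}{f_n(x + ta)} \right) dx.$$

In view of the assumptions of the theorem the derivative $\Gamma_n'(t)$ exists and moreover

$$\Gamma_n'(t) = -\int_{\mathscr{X}_n} h_n(x + ta, a)\, \frac{f_n(x)}{f_n(x) + f_n(x + ta)}\, f_n(x)\, dx.$$

Hence

$$|\Gamma_n'(t)| \le \int_{\mathscr{X}_n} |h_n(x + ta, a)|\, f_n(x)\, dx.$$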

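We shall utilize the following inequality due to Young*: if $g(t)$ is a function defined for all $t \ge 0$, continuous, strictly increasing with $g(0) = 0$, and $g^{-1}(t)$ is the inverse of $g$, then for all $a > 0$, $b > 0$ we have

$$ab \le \int_0^a g(t)\, dt + \int_0^b g^{-1}(t)\, dt.$$

Putting $g(t) = \frac{1}{\alpha} \ln(1 + t)$, with $\alpha > 0$, so that $g^{-1}(t) = e^{\alpha t} - 1$, and bounding each integral from above, we obtain

$$ab \le \frac{a}{\alpha} \ln(1 + a) + \frac{e^{\alpha b} - 1}{\alpha}.$$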

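Utilizing this inequality we have

$$|\Gamma_n'(t)| \le \int_{\mathscr{X}_n} |h_n(x + ta, a)|\, \frac{f_n(x)}{f_n(x + ta)}\, f_n(x + ta)\, dx \le \frac{1}{\delta} \int_{\mathscr{X}_n} f_n(x) \ln\left( 1 + \frac{f_n(x)}{f_n(x + ta)} \right) dx + \frac{1}{\delta} \int_{\mathscr{X}_n} \left( e^{\delta |h_n(x + ta, a)|} - 1 \right) f_n(x + ta)\, dx = \frac{1}{\delta}\, \Gamma_n(t) + \frac{1}{\delta} \int \left( e^{\delta |h_n(x, a)|} - 1 \right) \mu(dx),$$

where $\delta > 0$.

* W. H. Young, Sur la généralisation du théorème de Parseval, Comp. Rendus 155 (1912) 30-3 (or Proc. Royal Soc. (A) 87 (1912) 225-9.) Translator's Remark.

From this inequality involving the derivative we have

$$\Gamma_n(t) \le \left[ \Gamma_n(0) + \int e^{\delta |h_n(x, a)|}\, \mu(dx) - 1 \right] e^{|t| / \delta}.$$

Since $\{h_1(x, a), \ldots, h_n(x, a), \ldots, h(x, a)\}$ is a martingale, it follows that

$$\int e^{\delta |h_n(x, a)|}\, \mu(dx) \le \int e^{\delta |h(x, a)|}\, \mu(dx),$$

so that

$$\sup_n \int_{\mathscr{X}_n} \frac{f_n(x - t P_n a)}{f_n(x)} \ln \frac{f_n(x - t P_n a)}{f_n(x)}\, f_n(x)\, dx \le \sup_n \int_{\mathscr{X}_n} \ln\left( 1 + \frac{f_n(x)}{f_n(x + t P_n a)} \right) f_n(x)\, dx \le \sup_n \Gamma_n(t) < \infty.$$

To verify that $ta \in M_\mu$ one need merely apply the remark following Theorem 2 of Section 1.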

We now proceed with the derivation of formula (12). First note that in view of the relation

$$\rho((t + s) a, x) = \rho(ta, x)\, \rho(sa, x - ta)$$

it is sufficient to establish this formula for $t$ arbitrarily small. On the other hand, utilizing the homogeneity of $h(x, a)$ in $a$, we may assume that the condition of the theorem is satisfied for $\delta$ sufficiently large, for example $\delta = 4$. Utilizing the definition of $h_n(x, a)$ we have

$$\frac{f_n(P_n(x - ta))}{f_n(P_n x)} = \exp\left\{ -\int_0^t h_n(x - sa, a)\, ds \right\}.$$

Therefore,

$$\rho(ta, x) = \lim_{n \to \infty} \exp\left\{ -\int_0^t h_n(x - sa, a)\, ds \right\}.$$

To verify the existence of the integral $\int_0^t h(x - sa, a)\, ds$ $\mu$-almost for all $x$ and the fact that

$$\lim_{n \to \infty} \int_0^t h_n(x - sa, a)\, ds = \int_0^t h(x - sa, a)\, ds$$

in measure $\mu$, it is sufficient to show that the relation

$$\lim_{n, m \to \infty} \int \int_0^t |h_n(x - sa, a) - h_{n+m}(x - sa, a)|\, ds\, \mu(dx) = 0 \qquad (13)$$

is satisfied. (Indeed it follows from the $\mathfrak{B}$-measurability of $h(x, a)$ that $h(x - sa, a)$ is a Borel function in $s$, and $h_n(x - sa, a) \to h(x - sa, a)$ for each $s$ and $\mu$-almost for all $x$; hence in view of Fubini's theorem this holds $\mu$-almost for all $x$ and almost for all $s$ with respect to the Lebesgue measure; it follows from (13) that the integral

$$\int_0^t |h(x - sa, a)|\, ds$$

is finite and the equality

$$\lim_{n \to \infty} \int \int_0^t |h_n(x - sa, a) - h(x - sa, a)|\, ds\, \mu(dx) = 0$$

is valid.)

We have

$$\lim_{n, m \to \infty} \int \int_0^t |h_n(x - sa, a) - h_{n+m}(x - sa, a)|\, ds\, \mu(dx) = \lim_{n, m \to \infty} \int \int_0^t |h_n(x, a) - h_{n+m}(x, a)|\, f_{n+m}(x + s P_{n+m} a)\, ds\, dx.$$
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We now again utilize the following form of Young’s inequality 

e^^ — l 

ab^ \-b ln(l+a6). 

a 



\h„{x, a)-h„+„(x, a)\ f„+„(x) ds 4x< 

Jn + mV^) 



I lQxp{\h„(x, a)-h„+^{x, a)\} - 1] fi{dx)-h 



j j/„+m(x, 



) Inf 1 + a— — r') ds dx. 

V fn + m{x + sP„ + „a)J 



exp{2|lt„(x, a)-h„+„{x, a)|}<iexp{4|;i„(x, a)|}+^exp{4|fc„+„(x, a)|}. 
Therefore the integral 

(exp{|li„(x, a)-h„+„{x, a)|})^ n(dx) 

is uniformly bounded and hence the function exp{|/i„(x, a)~h^+„{x, a)|} 
is uniformly integrable and the limiting transition under the sign of the 
integral is justified. Consequently, 



lim exp{|/i„(x,a)-fc„+„(x,a)|};i(4x)=l, 
n, m-*^ ao J 

since \h„(x, a) — a)|~>0 as n-^co and m^oo in measure fi. 
Therefore 



t 

I j* \K{x,a)-h„^ 



^(x, a)\ ds fi{dx)^ 



:lim [ I* f„(x) ds dx. 

n^aoj J \ f,(x-sP„a)/ 



Since the sequence rj„=- ~ 7 - -- -- - is a martingale on the probability 
Jn 
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space (^, S, and rj^=Q{ — sa, x — sa) = limrj„ is such that Erj^ = 
lim Er]„=l, this sequence is uniformly integrable and hence, in view of 

n-* 00 

Theorem 3 in Section 2 of Chapter II, the sequence [fy„, n= 1,2,..., oo] is 
also a martingale. Therefore the sequence ln(l+a^„); n=l,2, ..., oo] 
is a semi-martingale since sup Erj„ ln(l -ha^„)< oo (the boundedness of 

n 

supEri„\n{l-\-(xri„) is proved analogously to the boundedness of 

n 

actually for a ^ 1, which is the case here, it follows from the boundedness 
of Therefore 

Eri„ ln(l+a>7„)^Ef;^ ln{l+xr]J, 
lim Ef/„ln(l + a/ 7 „)=E>j^ ln(l+a» 7 j 



(the mathematical expectation is taken in the probability space (^, ®, 
IX J). Since 



Etj„ ln{l + ccti, 
it follows that 



J=| /nW 



1 + a ^ — - 1 dx, 

fn{^-sPna)J 



lim f 

n~* 00 J * 



f„{x) ln( 1 +« , 1 dx ds = 

f„(x-sPA 



=j* J* ln(l+ae(— as, X— as)) 
0 



Hence 



lim 

n, m-* 00 J 



jlK 



{x — sa, a) — h„+f„{x — sa, a)\ ds ju(dx)^ 



J ln(l-l-a^( — as, x — sa)) fisa{dx) ds. 

0 



(14) 



It is easy to verify that the integral 



In ( 1 -h ( — as, X — sa)) (dx) 



approaches zero monotonically as ajO. Approaching the limit in (14) as 
aj,0 we obtain (13). The theorem is proved. □ 
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Corollary. Let be the Gaussian measure with mean value 0 and correla- 
tion operator B. Then = 

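Proof. Since the characteristic functional of measure $\mu$ is of the form
$\varphi(z) = \exp\{-\tfrac12 (Bz, z)\}$ and $\varphi(z) \to 1$ as $(Bz, z) \to 0$ we have, in view of
Theorem 2, the inclusion $M_\mu \subset B^{1/2}\mathcal{X}$. Denote by $e_1, e_2, \ldots$ and $\lambda_1, \lambda_2, \ldots$
the eigenvectors and eigenvalues of the operator $B$, respectively. If $\mathcal{L}_n$
is the subspace spanned by $e_1, \ldots, e_n$, then

$$f_n(x) = (2\pi)^{-n/2} \Big( \prod_{k=1}^n \lambda_k \Big)^{-1/2} \exp\Big\{ -\frac12 \sum_{k=1}^n \frac{(x, e_k)^2}{\lambda_k} \Big\}.$$

Therefore

$$h_n(x, a) = -\sum_{k=1}^n \frac{(x, e_k)(a, e_k)}{\lambda_k}.$$

It is easy to verify that

$$\int (h_n(x, a))^2 \, \mu(dx) = \sum_{k=1}^n \frac{(a, e_k)^2}{\lambda_k},$$

so that $h(x, a)$ is defined and

$$h(x, a) = -\sum_{k=1}^\infty \frac{(x, e_k)(a, e_k)}{\lambda_k}$$

provided only $\sum_{k=1}^\infty \frac{(a, e_k)^2}{\lambda_k} < \infty$. Since $(x, e_k)$ are independent Gaussian
variables on the probability space $(\mathcal{X}, \mathfrak{B}, \mu)$, $h(x, a)$ is also a Gaussian
variable. Therefore

$$\int e^{|h(x, a)|} \mu(dx) \le \int \big[ e^{h(x, a)} + e^{-h(x, a)} \big] \mu(dx) = 2 \exp\Big\{ \frac12 \int h^2(x, a)\, \mu(dx) \Big\} = 2 \exp\Big\{ \frac12 \sum_{k=1}^\infty \frac{(a, e_k)^2}{\lambda_k} \Big\}.$$

Thus in view of Theorem 9, $a \in M_\mu$ provided only $\sum_{k=1}^\infty \frac{(a, e_k)^2}{\lambda_k} < \infty$, i.e. if
$a \in B^{1/2}\mathcal{X}$. Our assertion is proved. □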

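It follows from (12) that, with $\lambda_k$ and $e_k$ the eigenvalues and eigenvectors of $B$,

$$\rho_\mu(a, x) = \exp\Big\{ \sum_{k=1}^\infty \frac{(x, e_k)(a, e_k)}{\lambda_k} - \frac12 \sum_{k=1}^\infty \frac{(a, e_k)^2}{\lambda_k} \Big\} = \exp\big\{ (B^{-1/2}a, B^{-1/2}x) - \tfrac12 |B^{-1/2}a|^2 \big\}.$$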
Finally we present an example of a measure $\mu$ for which $M_\mu$ is a linear
manifold everywhere dense in $\mathcal{X}$ and a vector $a$ exists which does not
belong to $M_\mu$ and such that $\mu_a$ is not orthogonal to $\mu$. Consider Gaussian
measures $\mu_1$ and $\mu_2$ with mean values 0 and correlation operators $A$ and
$B$. We assume that the eigenvectors for both operators are the same and
denote these eigenvectors by $e_1, e_2, \ldots$; the corresponding eigenvalues
will be denoted by $\alpha_k$ and $\beta_k$. Let $\alpha_n / \beta_n \to 0$ as $n \to \infty$. It is easy to verify that
$M_{\mu_1} \subset M_{\mu_2}$ and $M_{\mu_2} - M_{\mu_1}$ is a non-empty set. If $a \in M_{\mu_2} - M_{\mu_1}$, then
$(\mu_2)_a \sim \mu_2$ and $(\mu_1)_a \perp \mu_1$ (cf. corollary to Theorem 5). We show that $\mu_1 \perp (\mu_2)_a$. The
variables $(x, e_k) = \xi_k$ are Gaussian on each of the probability spaces
$(\mathcal{X}, \mathfrak{B}, \mu_1)$ and $(\mathcal{X}, \mathfrak{B}, (\mu_2)_a)$ and, moreover, $E\xi_k = 0$, $D\xi_k = \alpha_k$ on the
first and $E\xi_k = (a, e_k)$, $D\xi_k = \beta_k$ on the second of the spaces. Therefore

$$\frac1n \sum_{k=1}^n \frac{(x, e_k)^2}{\beta_k} \to 0$$

in measure $\mu_1$ and

$$\frac1n \sum_{k=1}^n \frac{(x, e_k)^2}{\beta_k} \to 1$$

in measure $(\mu_2)_a$. The orthogonality of $\mu_1$ and $(\mu_2)_a$ is proved. Note that it
follows from the presented proof that $(\mu_1)_a \perp \mu_2$ also. But then, in view of
Theorem 6, the set of admissible shifts of measure $\mu = \tfrac12(\mu_1 + \mu_2)$ coincides
with $M_{\mu_1} \cap M_{\mu_2}$, i.e. with $M_{\mu_1}$. If, however, $a \in M_{\mu_2} - M_{\mu_1}$, then $\mu_a$ possesses
an absolutely continuous component $\tfrac12 (\mu_2)_a$ with respect to measure $\mu$, so
that $\mu$ satisfies the required conditions.



§3. Absolute Continuity of Measures under Mappings of Spaces 

The basic problem considered in this section is the study of conditions
under which the mapping of a Hilbert space $\mathcal{X}$ into itself transforms
measure $\mu$ into a measure that is absolutely continuous with respect to $\mu$. If
$T(x)$ is a measurable mapping of $\mathcal{X}$ into $\mathcal{X}$, i.e. a mapping such that
$T^{-1}(A) \in \mathfrak{B}$ for each $A \in \mathfrak{B}$, then the measure $\mu$ under such a transformation
is translated into the measure $\nu$ defined by the equality

$$\nu(A) = \mu(T^{-1}(A)).$$ (1)

Below sufficient conditions will be found which assure the absolute
continuity of $\nu$ with respect to $\mu$, and an expression for $\frac{d\nu}{d\mu}$ is obtained by
means of characteristics of measure $\mu$ and the mapping $T$.
Before studying measures in infinite-dimensional spaces, we shall
obtain a solution of the posed problem in the case of a finite-dimensional
Euclidean space. Let the measure $\mu$ possess a density with respect to the
Lebesgue measure,

$$\mu(A) = \int_A f(x)\, dx,$$

and, moreover, let $f$ not vanish. We shall also assume that the
transformation $T$ is one-to-one and continuously differentiable. Then for a
measurable bounded function $g$

$$\int g(x)\, \nu(dx) = \int g(T(x))\, \mu(dx) = \int g(T(x)) f(x)\, dx = \int g(y) f(T^{-1}(y)) \left| \frac{DT^{-1}(y)}{Dy} \right| dy,$$



where $\frac{DT^{-1}(y)}{Dy}$ is the Jacobian of the inverse of $T$, which is also
differentiable in view of the imposed assumptions. The last integral in this
chain of equalities can be written as an integral in measure $\mu$:

$$\int g(y) \frac{f(T^{-1}(y))}{f(y)} \left| \frac{DT^{-1}(y)}{Dy} \right| \mu(dy).$$



Note that

$$\frac{f(T^{-1}(y))}{f(y)} = \frac{f(y - (y - T^{-1}(y)))}{f(y)} = \rho(y - T^{-1}(y), y),$$

where $\rho(a, x)$ is the density of measure $\mu_a$ with respect to measure $\mu$ (we
utilize the notation of Section 2). Therefore in a finite-dimensional space
the following formula

$$\int g(x)\, \nu(dx) = \int g(x)\, \rho(x - T^{-1}(x), x) \left| \frac{DT^{-1}(x)}{Dx} \right| \mu(dx)$$



is valid. In this form the formula makes sense also in the case of a Hilbert
space provided we assign a suitable meaning to the Jacobian of the
transformation. Let $V$ be a linear operator such that $V - I$, where $I$ is the
identity transformation, is completely continuous. Then $VV^*$ is a symmetric
nonnegative operator and $VV^* - I$ is also a completely continuous
operator. Let $\lambda_k$ be a sequence of eigenvalues of the operator $VV^*$ (this
operator possesses a complete system of eigenvectors). We set

$$|\det V| = \sqrt{ \prod_{k=1}^\infty \lambda_k }$$

provided this infinite product is either convergent or divergent to 0 or
to $+\infty$.
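
The finite-dimensional formula derived above can be checked numerically. The following is a minimal sketch under stated assumptions: the measure $\mu$ is taken to be the standard normal law on the real line and $T$ is the affine map $T(x) = \text{scale}\cdot x + \text{shift}$; the function names and parameter values are hypothetical and chosen only for illustration. The density $d\nu/d\mu$ computed from $f(T^{-1}(y))/f(y)$ times the Jacobian of $T^{-1}$ is compared with the ratio of the two normal densities obtained directly.

```python
import math


def normal_pdf(x, mean=0.0, var=1.0):
    """Density of the normal law N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def density_via_jacobian(y, scale=2.0, shift=1.0):
    """d(nu)/d(mu) at y from f(T^{-1}(y)) / f(y) * |D T^{-1}(y) / D y|.

    Assumptions: mu = N(0, 1) and T(x) = scale * x + shift with scale != 0.
    """
    t_inv = (y - shift) / scale      # T^{-1}(y)
    jacobian = abs(1.0 / scale)      # |D T^{-1}(y) / D y|
    return normal_pdf(t_inv) / normal_pdf(y) * jacobian


def density_direct(y, scale=2.0, shift=1.0):
    """Same density, using that the image of N(0, 1) under T is N(shift, scale^2)."""
    return normal_pdf(y, mean=shift, var=scale ** 2) / normal_pdf(y)


if __name__ == "__main__":
    for y in (-2.0, -0.5, 0.0, 1.3, 3.0):
        print(y, density_via_jacobian(y), density_direct(y))
```

The two columns printed by the script agree up to rounding, which is the one-dimensional content of the change-of-variables formula.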

Let $S(x)$ be a mapping of $\mathcal{X}$ into $\mathcal{X}$. This mapping is differentiable at
point $x_0$ if a linear operator $dS(x_0)$ exists such that the relation

$$|S(x_0 + x) - S(x_0) - dS(x_0)x| = o(|x|), \quad x \in \mathcal{X},$$

is satisfied; the operator $dS(x_0)$ is called the differential of $S(x)$ at point $x_0$.
If $\mathcal{X}$ is a finite-dimensional space then, as is easily seen, the Jacobian
of the transformation $S(x)$ coincides with $|\det dS(x)|$. The latter makes
sense in a Hilbert space as well. Thus we arrive at the formula

$$\int g(x)\, \nu(dx) = \int g(x)\, \rho(x - T^{-1}(x), x)\, |\det dT^{-1}(x)|\, \mu(dx).$$ (2)

The validity of this formula for a sufficiently wide class of functions $g$
leads to the equality

$$\frac{d\nu}{d\mu}(x) = \rho(x - T^{-1}(x), x)\, |\det dT^{-1}(x)|.$$ (3)

The remainder of this section is devoted to the investigation of
conditions under which formula (3) is valid in the general case as well as in the
case of Gaussian measures $\mu$. In order that formula (3) make sense,
certain general conditions must be imposed on measure $\mu$ and on
transformation $T$.



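Condition 1. Measure $\mu$ possesses a linear manifold of admissible shifts
$M$ and, for each orthonormal basis $\{e_k\}$, the projection $\mu^n$ of measure
$\mu$ on $\mathcal{L}_n$ possesses a continuous density $f_n(x)$ with respect to the Lebesgue
measure on $\mathcal{L}_n$ (here $\mathcal{L}_n$ is the subspace spanned by the vectors $e_1, e_2, \ldots, e_n$).
Moreover, for each $c > 0$ and each finite-dimensional subspace $N \subset M$,

$$\sup_{|a| \le c,\ a \in N} \left| \frac{f_n(P_n x - P_n a)}{f_n(P_n x)} - \rho(a, x) \right| \to 0$$

in measure $\mu$, where $P_n$ is the projector on $\mathcal{L}_n$ and

$$\rho(a, x) = \frac{d\mu_a}{d\mu}(x).$$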

Condition 2. The transformation $T(x)$ possesses the inverse denoted by
$S(x)$; operators $T(x)$ and $S(x)$ are locally bounded and continuously
differentiable; also the quantities $|\det dT(x)|$ and $|\det dS(x)|$ are finite,
non-zero, continuous and locally bounded.



Theorem 1. Let conditions 1 and 2 be satisfied and let a finite-dimensional
subspace $N$ exist in $M$ such that $x - T(x) \in N$, $x - S(x) \in N$ for all $x \in \mathcal{X}$ and,
moreover, let for the projector $P$ on $N$, $P(T(x)) = T(P(x))$ and $P(S(x)) = S(P(x))$.
Then $\nu \sim \mu$ and formula (3) is valid.



Proof. We choose the basis $e_1, e_2, \ldots$ such that for some $m$ the vectors
$e_1, e_2, \ldots, e_m$ form a basis in $N$. Let $\mu^n$ and $\nu^n$ be the projections of
measures $\mu$ and $\nu$ on the subspace $\mathcal{L}_n$. Then for $n > m$ and for any measurable
bounded function $g$ defined on $\mathcal{L}_n$ we have

$$\int_{\mathcal{L}_n} g(x)\, \nu^n(dx) = \int g(P_n x)\, \nu(dx) = \int g(P_n T(x))\, \mu(dx) = \int g(T(P_n x))\, \mu(dx) = \int_{\mathcal{L}_n} g(T(x))\, \mu^n(dx) = \int_{\mathcal{L}_n} g(T(x)) f_n(x)\, dx.$$

Under our assumptions the transformations $T$ and $S$ map $\mathcal{L}_n$ into $\mathcal{L}_n$
for $n > m$. Changing the variables of integration by means of $x = S(y)$, we
obtain

$$\int_{\mathcal{L}_n} g(x)\, \nu^n(dx) = \int_{\mathcal{L}_n} g(y) f_n(S(y))\, |\det dS(y)|\, dy = \int_{\mathcal{L}_n} g(x) \frac{f_n(S(x))}{f_n(x)}\, |\det dS(x)|\, \mu^n(dx).$$

Since $S(x) = x + P(S(x) - x)$, the Jacobi transformation matrix for $S(x)$
is of the form $I + U$ where $U$ has non-zero elements only in the first $m$
rows and these elements do not depend on $n$. Thus for $n > m$ the absolute
value of the Jacobian of the transformation $S(x)$ does not depend on $n$
and coincides with $|\det dS(x)|$.

Hence

$$\frac{d\nu^n}{d\mu^n}(x) = \frac{f_n(S(x))}{f_n(x)}\, |\det dS(x)|.$$

Clearly measures $\nu^n$ and $\mu^n$ are equivalent and

$$\frac{d\mu^n}{d\nu^n}(x) = \frac{f_n(x)}{f_n(S(x))}\, |\det dS(x)|^{-1}.$$

Therefore,

$$\frac{d\mu^n}{d\nu^n}(T(x)) = \frac{f_n(T(x))}{f_n(x)\, |\det dS(T(x))|} = \frac{f_n(T(x))}{f_n(x)}\, |\det dT(x)|,$$

since according to the rule of differentiation of a composite function
$I = dx = d[S(T(x))] = dS(T(x))\, dT(x)$.

In view of condition 1 the following limits

$$\lim_{n \to \infty} \frac{f_n(P_n S(x))}{f_n(P_n x)} = \rho(x - S(x), x), \qquad \lim_{n \to \infty} \frac{f_n(P_n T(x))}{f_n(P_n x)} = \rho(x - T(x), x)$$

exist in the sense of convergence in measure $\mu$; moreover these limits are
different from zero (here we utilize the facts that $x - S(x) \in N$ and
$x - T(x) \in N$, that these functions are locally bounded, and condition 1).
Applying Corollary 1 to Theorem 3 in Section 1 we complete the proof
of the theorem. □

Remark 1. We have proved that under the conditions of Theorem 1
$\nu \sim \mu$ and in addition to formula (3) the following formula

$$\frac{d\mu}{d\nu}(T(x)) = \rho(x - T(x), x)\, |\det dT(x)|$$ (4)

is valid.

Remark 2. Formulas (3) and (4) remain valid if for any $y$ and $\delta$ a finite-dimensional
subspace $N_{y,\delta} \subset M$ can be found such that the conditions of
the theorem are satisfied for $|x - y| < \delta$ provided $N$ is replaced by $N_{y,\delta}$.

Theorem 2. Let conditions 1 and 2 be satisfied and let a basis $\{e_k\}$ of
vectors in $M$ exist such that

1) for $n$ sufficiently large the mappings $T_n(x) = x + P_n(T(x) - x)$ and
$S_n(x) = x + P_n(S(x) - x)$ are invertible and $|\det dT_n(x)| \to |\det dT(x)|$,
$|\det dS_n(x)| \to |\det dS(x)|$ in the sense of convergence in measure $\mu$;

2) the expressions $\rho(P_n(x - S(x)), x)$ and $\rho(P_n(x - T(x)), x)$ possess limits
as $n \to \infty$ in measure $\mu$, which will be denoted by $\rho(x - S(x), x)$ and
$\rho(x - T(x), x)$ correspondingly; we can substitute $T(x)$ for $x$ in $\rho(x - S(x), x)$
and

$$\rho(T(x) - x, T(x))\, \rho(x - T(x), x) = 1.$$

Then the measures $\mu$ and $\nu$ are equivalent and the formulas (3) and (4) are
valid.

Proof. Let

$$S_n^m(x) = x + P_n(S(P_m x) - x).$$

For m>n

The mapping $S_n^m$ and its inverse satisfy the conditions of Theorem 1. If
$\nu_n^m$ is the measure determined by the equality

$$\nu_n^m(A) = \mu(S_n^m(A)),$$

then

$$\frac{d\nu_n^m}{d\mu}(x) = \rho(x - S_n^m(x), x)\, |\det dS_n^m(x)|.$$

Note that condition 1 and the fact that $f_n(x)$ are continuous and positive
for all $n$ imply that $\rho(a, x)$ is uniformly continuous in $a$ for $|a| \le c$
in measure $\mu$, i.e.

$$\sup\{ |\rho(a_1, x) - \rho(a_2, x)| ;\ |a_1| \le c,\ |a_2| \le c,\ |a_1 - a_2| \le \delta \} \to 0$$

in measure $\mu$ as $\delta \to 0$. Since $x - S_n^m(x) \in \mathcal{L}_n$ for all $m$, $x - S_n^m(x) \to x - S_n(x)$
and $x - S_n(x)$ are bounded in measure $\mu$, it follows that

$$\rho(x - S_n^m(x), x) \to \rho(x - S_n(x), x)$$

in measure $\mu$ as $m \to \infty$. Next it is obvious that $dS_n^m(x)$ is of the form
$dS_n^m(x) = I + V_m(x)$ for all $m$, where $V_m(x)$ maps the whole space into $\mathcal{L}_n$.
It can be verified that in the case when $V$ maps $\mathcal{X}$ into $\mathcal{L}_n$ we have

$$|\det(I + V)| = \big| \det \| ((I + V) e_i, e_j) \|_{i,j = 1, \ldots, n} \big|,$$

where $\| ((I + V) e_i, e_j) \|$ is the matrix of order $n$ with the $i,j$-th
element being $((I + V) e_i, e_j)$. Since for all $i$ and $j$

$$\lim_{m \to \infty} (dS_n^m(x) e_i, e_j) = (dS_n(x) e_i, e_j),$$

it follows that

$$\lim_{m \to \infty} |\det dS_n^m(x)| = |\det dS_n(x)|.$$

Therefore

$$\lim_{m \to \infty} \frac{d\nu_n^m}{d\mu}(x) = \rho(x - S_n(x), x)\, |\det dS_n(x)|$$

in the sense of convergence in measure $\mu$. Utilizing the equality
$x - S_n(x) = P_n(x - S(x))$ and the conditions of the theorem we may assert that

$$\lim_{n \to \infty} \lim_{m \to \infty} \frac{d\nu_n^m}{d\mu}(x) = \rho(x - S(x), x)\, |\det dS(x)|$$

in the sense of convergence in measure $\mu$. Hence one can choose sequences
$n_k$ and $m_k$ such that $m_k > n_k$ and measures $\nu_k = \nu_{n_k}^{m_k}$ which satisfy the
relation

$$\lim_{k \to \infty} \frac{d\nu_k}{d\mu}(x) = \rho(x - S(x), x)\, |\det dS(x)|$$

in measure $\mu$.

Now let $T_n^m(x) = x + P_n(T(P_m x) - x)$. Analogously we can show that
one may choose sequences $n_k$ and $m_k$ in such a manner that the measures

$$\tilde{\nu}_k(A) = \mu(T_k^{-1}(A)), \quad \text{where } T_k = T_{n_k}^{m_k},$$

satisfy the following:

$$\lim_{k \to \infty} \frac{d\mu}{d\tilde{\nu}_k}(T_k(x)) = \rho(x - T(x), x)\, |\det dT(x)|$$

in the sense of convergence in measure $\mu$.

Finally,

$$\rho(T(x) - x, T(x))\, |\det dS(T(x))|\, \rho(x - T(x), x)\, |\det dT(x)| = 1,$$

since according to the rule of differentiation of a composite function
$I = dx = d(S(T(x))) = dS(T(x))\, dT(x)$,
and, hence, $|\det dS(T(x))| \cdot |\det dT(x)| = 1$, while

$$\rho(T(x) - x, T(x))\, \rho(x - T(x), x) = 1$$

in view of condition 2) of the theorem. Hence one may utilize Corollary 2
of Theorem 3 in Section 1, which yields the proof of the theorem. □
Consider now the case when the measure $\mu$ is Gaussian with the mean
value 0 and correlation operator $B^2$. We have shown in Section 2 that
in this case $M = B\mathcal{X}$ and if $a = Bb$, $e_k$ are the eigenvectors and $\beta_k$ the
corresponding eigenvalues of the operator $B$, then

$$\rho(a, x) = \exp\Big\{ \sum_{k=1}^\infty \frac{(a, e_k)(x, e_k)}{\beta_k^2} - \frac12 \sum_{k=1}^\infty \frac{(a, e_k)^2}{\beta_k^2} \Big\} = \exp\Big\{ \sum_{k=1}^\infty \frac{(b, e_k)(x, e_k)}{\beta_k} - \frac12 |b|^2 \Big\}.$$

Let the transformation $T(x)$ be of the form $T(x) = x + B\lambda(x)$, where $\lambda(x)$
is a continuous and continuously differentiable mapping. If $T$ is invertible,
then $S(x) = x + B\lambda^*(x)$, where $\lambda^*(x) = -\lambda(S(x))$ is also continuous
and continuously differentiable.

Since in the case of a Gaussian measure the function $\ln \frac{f_n(x - a)}{f_n(x)}$ is
a sum of quadratic and linear functionals (in $a$), it follows from the
convergence of $\frac{f_n(x - a)}{f_n(x)}$ to $\rho(a, x)$ that this convergence is uniform in $a$ for
$a \in \mathcal{L}_m$, $|a| \le c$, for any $m$ and $c$. Therefore for Gaussian measures
condition 1 of Theorem 2 is always satisfied. Consider now condition 2 of the
same theorem.

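Since

$$\rho(P_n(x - T(x)), x) = \exp\Big\{ -\sum_{k=1}^n \frac{(\lambda(x), e_k)(x, e_k)}{\beta_k} - \frac12 \sum_{k=1}^n (\lambda(x), e_k)^2 \Big\},$$

the existence of the limit

$$\lim_{n \to \infty} \rho(P_n(x - T(x)), x)$$

(different from zero) is equivalent to the convergence in measure $\mu$ of
the series

$$\sum_{k=1}^\infty \frac{(\lambda(x), e_k)(x, e_k)}{\beta_k},$$

and this limit is equal to

$$\rho(x - T(x), x) = \exp\Big\{ -\sum_{k=1}^\infty \frac{(\lambda(x), e_k)(x, e_k)}{\beta_k} - \frac12 |\lambda(x)|^2 \Big\}.$$

In the same manner $\lim_{n \to \infty} \rho(P_n(x - S(x)), x)$ exists iff the series

$$\sum_{k=1}^\infty \frac{(\lambda^*(x), e_k)(x, e_k)}{\beta_k}$$

is convergent in measure $\mu$, and moreover

$$\rho(x - S(x), x) = \exp\Big\{ -\sum_{k=1}^\infty \frac{(\lambda^*(x), e_k)(x, e_k)}{\beta_k} - \frac12 |\lambda^*(x)|^2 \Big\}.$$

Note finally that, since $\lambda^*(T(x)) = -\lambda(x)$,

$$\rho(T(x) - S(T(x)), T(x)) = \exp\Big\{ -\sum_{k=1}^\infty \frac{(\lambda^*(T(x)), e_k)(T(x), e_k)}{\beta_k} - \frac12 |\lambda^*(T(x))|^2 \Big\} =$$

$$= \exp\Big\{ \sum_{k=1}^\infty \frac{(\lambda(x), e_k)(x + B\lambda(x), e_k)}{\beta_k} - \frac12 |\lambda(x)|^2 \Big\} = \exp\Big\{ \sum_{k=1}^\infty \frac{(\lambda(x), e_k)(x, e_k)}{\beta_k} + \frac12 |\lambda(x)|^2 \Big\} = (\rho(x - T(x), x))^{-1}.$$

We have thus proved the following theorem: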

Theorem 3. Let $\mu$ be a Gaussian measure with mean 0 and correlation
operator $B^2$; let $\beta_k$ and $e_k$ be the eigenvalues and eigenvectors respectively
of the operator $B$. If

a) the transformation $T(x)$ satisfies condition 2 and is of the form
$T(x) = x + B\lambda(x)$ and $S(x) = T^{-1}(x) = x + B\lambda^*(x)$; the transformations

$$T_n(x) = x + P_n B\lambda(x), \quad S_n(x) = x + P_n B\lambda^*(x)$$

are invertible for $n$ sufficiently large and finally

$$|\det dT_n(x)| \to |\det dT(x)|, \quad |\det dS_n(x)| \to |\det dS(x)|,$$

b) the series

$$\sum_{k=1}^\infty \frac{(\lambda(x), e_k)(x, e_k)}{\beta_k} \quad \text{and} \quad \sum_{k=1}^\infty \frac{(\lambda^*(x), e_k)(x, e_k)}{\beta_k}$$

are convergent $\mu$-almost everywhere, then the measures $\nu$ and $\mu$ are
equivalent and

$$\frac{d\nu}{d\mu}(x) = |\det dS(x)| \exp\Big\{ -\sum_{k=1}^\infty \frac{(\lambda^*(x), e_k)(x, e_k)}{\beta_k} - \frac12 |\lambda^*(x)|^2 \Big\}.$$ (5)
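
A one-dimensional sanity check of formula (5) may help fix the roles of $\lambda$, $\lambda^*$ and $|\det dS|$. The sketch below is only an illustration under assumptions that are not in the text: the space is the real line, $\mu = N(0, \beta^2)$, and $T(x) = (1 + c)x$ with a hypothetical constant $c > -1$, so that $\lambda(x) = cx/\beta$, $S(x) = x/(1 + c)$ and $\lambda^*(x) = -\lambda(S(x))$. The value given by (5) is compared with the ratio of the densities of $N(0, (1 + c)^2\beta^2)$ and $N(0, \beta^2)$.

```python
import math


def normal_pdf(x, var):
    """Density of the centered normal law N(0, var) at x."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def density_formula5(x, beta, c):
    """Right-hand side of (5) on the real line for T(x) = (1 + c) * x.

    Assumptions: mu = N(0, beta^2), beta > 0, c > -1 (hypothetical example).
    Here T(x) = x + beta * lam(x) with lam(x) = c * x / beta.
    """
    s = x / (1.0 + c)                 # S(x) = T^{-1}(x)
    lam_star = -c * s / beta          # lam*(x) = -lam(S(x))
    det_ds = abs(1.0 / (1.0 + c))     # |det dS(x)|
    return det_ds * math.exp(-lam_star * x / beta - 0.5 * lam_star ** 2)


def density_direct(x, beta, c):
    """d(nu)/d(mu), using that nu, the image of mu under T, is N(0, (1 + c)^2 beta^2)."""
    return normal_pdf(x, ((1.0 + c) * beta) ** 2) / normal_pdf(x, beta ** 2)


if __name__ == "__main__":
    for x in (-1.5, 0.0, 0.4, 2.0):
        print(x, density_formula5(x, beta=0.7, c=0.3), density_direct(x, beta=0.7, c=0.3))
```

In one dimension the series in condition b) reduces to a single term, so no convergence question arises; the difficulty discussed in the text is purely infinite-dimensional.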



We now discuss certain sufficient conditions for convergence of the 
series appearing in b). These will be used to check whether condition b) 
is fulfilled. 



Lemma. If measure $\mu$ is as in Theorem 3 and $\lambda(x)$ is a continuous mapping
of $\mathcal{X}$ into $\mathcal{X}$, then for the convergence of the series

$$\sum_{k=1}^\infty \frac{(\lambda(x), e_k)(x, e_k)}{\beta_k}$$ (6)

in measure $\mu$ it is sufficient that one of the following conditions be fulfilled:

1) numbers $\alpha_k$ exist such that $\sum_{k=1}^\infty \alpha_k^2 < \infty$ and the series
$\sum_{k=1}^\infty \frac{1}{\alpha_k^2} (\lambda(x), e_k)^2$ is convergent $\mu$-almost everywhere.

2) the series 

X (A( A(P.- 14 J (A(P,_ e,f . 

fc=l Pk fc=l 



are convergent in measure p. 
3) the series



Z J [('^.7(4 ^i) ej)-(Hx), e,) (/I (4 ej)] n(dx), 

f f {X{x), fi(dx), 

k=i J Pk 

where >l^j(x) = A(x — (x, e) e^ — {x, ej) are convergent. 

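Proof. 1) follows from the inequality

$$\Big| \sum_{k=n}^m \frac{(\lambda(x), e_k)(x, e_k)}{\beta_k} \Big| \le \sqrt{ \sum_{k=n}^m \frac{\alpha_k^2 (x, e_k)^2}{\beta_k^2} \cdot \sum_{k=n}^m \frac{1}{\alpha_k^2} (\lambda(x), e_k)^2 }$$

and from the fact that

$$\sum_{k=1}^\infty \int \frac{\alpha_k^2 (x, e_k)^2}{\beta_k^2}\, \mu(dx) = \sum_{k=1}^\infty \alpha_k^2 < \infty.$$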

2) Since the convergence of series (6) on the set {x:|/l(x)p=^c} for 
any c> 0 implies its convergence in measure p, we may assume without 
loss of generality that |A(x)|^=^c. Denote by the set of x such that 

00 

k= 1 

Let 2^(x) = /l(x) for xeH^, 2^(x) = 0 for x^H^. Since the convergence of 

00 

the series ^ implies that 1 as 00, in order 

k= 1 

the series (6) be convergent it is sufficient that the series 

V n t \ 

k=l Pk 

be convergent for each m in measure p. However, 
k=l Pk 

-I (r.(*)-a.(P.-.4^+E a.(p-,x» ■!.)—. 

k=l Pk k=l Pk 

Note that in view of the inequality 



X (/(n_iF,x),g,f< X (HPu-,x),e,y + 
+ E E mPk-ix),euf + \X{Pi{x)\



for all I and xeH^WQ have Let Then 

k=l Pk k=l Pk 

and by assumption the last series is convergent. Since m can be chosen 

00 

arbitrarily large it is sufficient to show that the series ^ (Pj^ _ ^ x), x 

k=l 

^k) . 



Pk 



is convergent in measure //. However convergence of this series 



follows from the fact that its marginal sums form a martingale on the 
probability space {^ , ®, ju}. Indeed (x, are independent Gaussian 
variables and moreover 

iUPk-,x),e,)^^Y = E E iUPk-kX),e,r^^- 



Pk 



Pi 

= E E(UPk-kX),e,rE^^^m. 

k=l Pk 



3) We show that series (6) is mean-square convergent. Clearly, 

«fc) (x, e^)Y 



Kl.~ 



Pk 



_y f(^(4 
k = n J 



jx{dx) = 

^kf {x, ekf 



Pi 



^{dx) + 



+ 2 X f ^(d::) I 

n^KJ^m J PiPj 

E f [(^(^)> «i) ^j)-i^ij{xl «i) Yij{x), ej)] X 

n^i< j^m J 

n{dx)= 



+ 2 
(x, ej (x, ej) 



PiPj 



y (A(x), {x, e*)" 

E ^2 f^{dx) + 



PI 



+ 2 



E [(>1(4 ^i) (>^(4 ^j)- 

n^i<j^m 



/, ..A .W, /-A .n(^’^')(^’^j),./J,A 

’(4(4 ^i) (4(4 ^j)] pp /i(dx). 
Since



J ei) Bj) {x, e,) {x, Bj) fi(dx)=0 

in view of the independence of the variables ^ij{x), (x, and (x, ej) on the 
probability space {^, S, /x}. Hence 

Y y (zl(x)j B^) {X, B^ 

J \hn Pk 

as $n \to \infty$ and $m \to \infty$. The lemma is thus proved. □

We apply Theorem 3 to the case when the transformation $T(x)$ is
only slightly different from the identity transformation. Let a family of
transformations $T_\varepsilon(x) = x + \varepsilon A(x)$ be given; then $S_\varepsilon(x)$ is of the form
$S_\varepsilon(x) = x - \varepsilon A(x) + O(\varepsilon^2)$ (only terms of order not higher than $\varepsilon$ are
considered). If $dA(x)$ has a finite trace, then

In |det x/ 7^ (x)| = ^ Sp (x)]* 

fc=i k 



li{dx)^0 



for 8>0 sufficiently small. Consequently, taking into account only the 
terms of order no higher than 8, we may write 



— (x)=l-8Sp[rfA(x)]-£ X ^ + 0(£^). 

djl k= 1 Pk 



As it is seen from this formula the basic difficulty encountered in applying 
Theorem 3, namely verification of the convergence of the series appearing 
in condition b) of this theorem remains intact for transformations which 
are arbitrarily close to the identity also. 

Now consider the case when transformation T is linear. 



Theorem 4. Let $\mu$ be a Gaussian measure with mean value 0 and a positive
correlation operator $B^2$. If the linear operator $T$ is invertible and is of the
form $T = I + BCB^{-1}$, where $\operatorname{Sp} CC^* < \infty$, and if $I + C$ possesses a
bounded inverse, then

$$\frac{d\nu}{d\mu}(x) = K \exp\{W(x)\},$$ (7)

where

$$K = \lim_{n \to \infty} |\det(I + D_n)|\, e^{-\operatorname{Sp} D_n}, \quad D_n = P_n D P_n,$$ (8)

and

$$W(x) = \lim_{n \to \infty} \big[ -(DB^{-1}P_n x, B^{-1}P_n x) - \tfrac12 |P_n DB^{-1}P_n x|^2 + \operatorname{Sp} D_n \big].$$ (9)

This limit is taken in the m.s. sense in measure $\mu$, $P_n$ is the projector on
$\mathcal{L}_n$, and $D = B^{-1}T^{-1}B - I$.
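
The proof below works with the normalized quantity $|\det(I + D_n)| \exp\{-\operatorname{Sp} D_n\}$ rather than with the determinant alone. The following sketch illustrates why the exponential factor matters, under a simplifying assumption that is not in the text: $D$ is taken diagonal with the hypothetical eigenvalues $d_k = 1/k$, which are square-summable (so $\operatorname{Sp} DD^* < \infty$) but not summable. The plain product $\prod (1 + d_k)$ then diverges, while the normalized product converges to a non-zero limit.

```python
import math


def harmonic_eigs(n):
    """Hypothetical diagonal entries d_k = 1/k, k = 1..n, of a diagonal operator D_n."""
    return [1.0 / k for k in range(1, n + 1)]


def plain_det(eigs):
    """det(I + D_n) = prod (1 + d_k) for diagonal D_n with entries d_k > -1."""
    return math.exp(sum(math.log1p(d) for d in eigs))


def regularized_det(eigs):
    """det(I + D_n) * exp(-Sp D_n) = prod (1 + d_k) * exp(-d_k)."""
    return math.exp(sum(math.log1p(d) - d for d in eigs))


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        eigs = harmonic_eigs(n)
        # The first column grows like n + 1; the second one stabilizes.
        print(n, plain_det(eigs), regularized_det(eigs))
```

For this particular choice the plain product equals $n + 1$ exactly, and the normalized product equals $(n + 1)\exp\{-(1 + \tfrac12 + \ldots + \tfrac1n)\}$, which tends to $e^{-\gamma}$ with $\gamma$ the Euler constant.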

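Proof. Operator $T$ maps $M$ into $M$ so that $D$ is defined at least on $M$.
We show that $\operatorname{Sp} DD^* < \infty$. Since

$$T^{-1} = I - BCB^{-1}T^{-1}, \quad D = -CB^{-1}T^{-1}B = -C(I + C)^{-1} = CV,$$

where $V = -(I + C)^{-1}$ is a bounded operator, we have

$$\operatorname{Sp} DD^* = \sum_{k=1}^\infty (D^* e_k, D^* e_k) = \sum_{k=1}^\infty (V^* C^* e_k, V^* C^* e_k) \le \|VV^*\| \sum_{k=1}^\infty |C^* e_k|^2 = \|VV^*\| \operatorname{Sp} CC^*.$$

From this relation it follows that $D$ is bounded. Set $D_n = P_n D P_n$. Then

$$\operatorname{Sp}(D - D_n)(D - D_n)^* = \operatorname{Sp} DD^* - \operatorname{Sp} D_n D_n^* = \sum_{k=1}^\infty |D^* e_k|^2 - \sum_{k=1}^\infty |D_n^* e_k|^2 =$$

$$= \sum_{k=1}^\infty \sum_{j=1}^\infty (D^* e_k, e_j)^2 - \sum_{k=1}^n \sum_{j=1}^n (D^* e_k, e_j)^2 \to 0$$

as $n \to \infty$.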

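We show that the limit

$$\lim_{n \to \infty} |\det(I + D_n)| \exp\{-\operatorname{Sp} D_n\}$$

exists and is different from zero. Let $U_n = D_n + D_n^* + D_n D_n^*$. Then

$$|\det(I + D_n)| \exp\{-\operatorname{Sp} D_n\} = \sqrt{ \det(I + U_n)\, e^{-\operatorname{Sp} U_n} }\, \exp\{\tfrac12 \operatorname{Sp} D_n D_n^*\}.$$

Since $\operatorname{Sp} D_n D_n^* \to \operatorname{Sp} DD^*$ it is sufficient to show that the limit

$$\lim_{n \to \infty} \det(I + U_n)\, e^{-\operatorname{Sp} U_n}$$ (10)

exists. Set $U = D + D^* + DD^*$.

It follows from the condition $\operatorname{Sp}(D - D_n)(D - D_n)^* \to 0$ that
$\operatorname{Sp}(U - U_n)(U - U_n)^* \to 0$ also. Denote by $\lambda_k^{(n)}$ the eigenvalues of the operator
$U_n$ in $\mathcal{L}_n$ ($U_n$ maps $\mathcal{L}_n$ into $\mathcal{L}_n$ and its orthogonal complement into zero);
next, denote by $f_k^{(n)}$ the eigenvectors corresponding to $\lambda_k^{(n)}$ (it is
assumed that $\lambda_k^{(n)}$ are ordered according to their absolute value). It follows
from relation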

Z \ufr-^fVn^^sp(u-u„y-^o 

i= 1 
that $\lambda_k^{(n)} \to \lambda_k$ and $f_k^{(n)} \to f_k$, where $f_k$ are the eigenvectors of the operator $U$

and $\lambda_k$ are the corresponding eigenvalues. Next we have

" (n) 

det(/+(7„)c~^P^-= n = 

k= 1 

=( ^11 (1+4"*) 

Since 

m m 

lim n (l+Al>")e-^l">=n (l+4)e-^\ 

n^co k=l k=l 

to prove the existence of the limit (10) it is sufficient to show that 



lim lim X (W = 0. 



m-^oo n-*ao m+ 1 



However 



lim lim ^ (4”>)^= lim lim SpU^~Y, (4"’)^ 

m-*oo «-^oo[_ fc=l 



m-^oo n-^oo k = m+l 



= lim Sp (7^ — ^ Aj 
m->c»L k=l 



= 0 . 



The existence of limit (10) is thus established. Moreover, this limit is 
different from zero since 



K= Yl 



k=l 



and since 1 +7,;,^0 in view of the invertability of the operator (1 +D) x 
X (1 -f-D*). We now prove the existence of limit (9). Let 



W„(x) = 



\D„B-^P„x\^ + SpD, 



n * 



We then have, in view of formula (7) in Section 6 of Chapter 5: 

iW„ix)-W„{x)T p{dx)= 



D„ + ^D*D„-D„-^D*D„} B ^x, B ^x) + 
+ Sp(D„ — Dj



-0 



ti{dx)^\-Sp{D*D„-DM 



, ^ fD„-D^ + D*-DZ , D*D„-D*D^, ^ 
+ Sp ; 1 1 € 



-Sp(D*D„-DM 



\Up(U„-Uj^; 



the last expression approaches zero as n-^oo and m->oo. The existence 
of limit (9) is thus established. 

We now proceed to the proof of formula (7). Let the measure v„ be 
defined by equation v„ {A) = ili{T~^ A), where T~^ It then 

follows from Theorem 3 that 



dv, 



djii 



”(x) = |det5(/ + D„)B-'|x 



X exp 



-z 

k=i L 



-{D„B-^x,e,){x,e,) ^\D„B~^P„x\^ 






Pi 



= |det(/ + D„)| exp<; -{D„B-^P„x, B~^ P„x)~- \D„B~^ 



since |det5(l +D„) B~^\ coincides with the absolute value of the deter- 
minant of the transformation matrix B{l-\-D„) B~ ^ considered in and 
written in the orthonormal basis, it follows that 

|det5(/ + D„)B-^| = ldet(/ + D„)|. 

Hence 

^{x) = K„cxp{W„{x)}, 
dfx 

where W^(x) is defined above and 

K„ = \dct(I + D„)\ 

As we have shown 

dv 

lim — - (x) = K exp { lT(x)} 

n^oo dfi 

in the sense of convergence in measure fi. Now let measure v„ be defined 
by the equality v„ = fi{f~^{A)), where f„=l + BP„CP„B~^. It can be 
shown analogously to the above, that 
in measure $\mu$, where $K$ and $W$ are defined by formulas (8) and (9) with
$T$ replaced by $T^{-1}$ and $B^2$ by $TB^2T^*$.

Utilizing now Corollary 2 of Theorem 3 in Section 1 we obtain the 
proof of the theorem. □ 



§4. Absolute Continuity of Gaussian Measures in a Hilbert Space 

Let two Gaussian measures $\mu_1$ and $\mu_2$ with mean values $a_1$ and $a_2$ and
correlation operators $B_1$ and $B_2$ correspondingly be defined in the Hilbert
space $(\mathcal{X}, \mathfrak{B})$. Below we establish necessary and sufficient conditions on
$a_1$, $a_2$, $B_1$ and $B_2$ for the measure $\mu_2$ to be absolutely continuous with
respect to $\mu_1$. It will be shown that the density $d\mu_2/d\mu_1$ is everywhere
positive, so that the absolute continuity of $\mu_2$ with respect to $\mu_1$ implies
the equivalence of measures $\mu_1$ and $\mu_2$. Moreover, it turns out that the
violation of the condition of absolute continuity implies the orthogonality
of the measures, so that two Gaussian measures are either equivalent or
orthogonal.

The case when the measures have different location parameters but
$B_1 = B_2$ was partially studied in Section 2.

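Theorem 1. If $B_1 = B_2 = B$, then in order that $\mu_2 \ll \mu_1$ it is necessary and
sufficient that $a_2 - a_1 \in B^{1/2}\mathcal{X}$; moreover

$$\frac{d\mu_2}{d\mu_1}(x) = \exp\big\{ (B^{-1/2}(x - a_1), B^{-1/2}(a_2 - a_1)) - \tfrac12 |B^{-1/2}(a_2 - a_1)|^2 \big\}.$$ (1)

If $a_2 - a_1 \notin B^{1/2}\mathcal{X}$, then $\mu_1 \perp \mu_2$.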
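
Formula (1) can be checked in a finite-dimensional model. The sketch below assumes, for illustration only, that the space is $R^n$ and that $B$ is diagonal with hypothetical positive eigenvalues `lam`; in that setting both measures have ordinary densities and the logarithm of (1) can be compared with the difference of the two Gaussian log-densities.

```python
import math


def log_density_formula1(x, a1, a2, lam):
    """Logarithm of the right-hand side of (1) for B = diag(lam) on R^n.

    (B^{-1/2}(x - a1), B^{-1/2}(a2 - a1)) - 1/2 |B^{-1/2}(a2 - a1)|^2
    """
    inner = sum((xi - p) * (q - p) / l for xi, p, q, l in zip(x, a1, a2, lam))
    norm_sq = sum((q - p) ** 2 / l for p, q, l in zip(a1, a2, lam))
    return inner - 0.5 * norm_sq


def log_gauss(x, mean, lam):
    """Log-density of N(mean, diag(lam)) at x."""
    return sum(-0.5 * math.log(2.0 * math.pi * l) - (xi - m) ** 2 / (2.0 * l)
               for xi, m, l in zip(x, mean, lam))


def log_density_direct(x, a1, a2, lam):
    """log(d mu_2 / d mu_1)(x) computed from the two densities."""
    return log_gauss(x, a2, lam) - log_gauss(x, a1, lam)


if __name__ == "__main__":
    lam = [1.0, 0.5, 0.25]        # hypothetical eigenvalues of B
    a1 = [0.0, 1.0, -1.0]         # hypothetical mean of mu_1
    a2 = [0.3, 0.8, -0.9]         # hypothetical mean of mu_2
    x = [0.2, -0.4, 1.1]
    print(log_density_formula1(x, a1, a2, lam), log_density_direct(x, a1, a2, lam))
```

In infinite dimensions the quantity $|B^{-1/2}(a_2 - a_1)|^2 = \sum_k (a_2 - a_1, e_k)^2/\lambda_k$ may be infinite, which is exactly the case $a_2 - a_1 \notin B^{1/2}\mathcal{X}$ treated in the last assertion of the theorem.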

Proof. The first assertion and formula (1) were verified in Section 2.
Now let

$$\frac{1}{\sqrt{\lambda_k}} (a_2 - a_1, e_k) = \alpha_k,$$

where $\lambda_k$ are the eigenvalues of the operator $B$ and $e_k$ are the corresponding
eigenvectors. If $a_2 - a_1 \notin B^{1/2}\mathcal{X}$ then $\sum_{k=1}^\infty \alpha_k^2 = +\infty$. Consider the function

$$g_n(x) = \Big( \sum_{k=1}^n \alpha_k^2 \Big)^{-1} \sum_{k=1}^n \alpha_k \frac{(x - a_1, e_k)}{\sqrt{\lambda_k}}.$$

Since

$$\int g_n(x)\, \mu_1(dx) = 0, \quad \int g_n(x)\, \mu_2(dx) = 1,$$

$$\int g_n^2(x)\, \mu_1(dx) = \int (g_n(x) - 1)^2 \mu_2(dx) = \Big( \sum_{k=1}^n \alpha_k^2 \Big)^{-1},$$

it follows that $g_n \to 0$ in measure $\mu_1$ and that $g_n \to 1$ in measure $\mu_2$. This
implies the orthogonality of measures $\mu_1$ and $\mu_2$. The theorem is
proved. □

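Consider the case when $a_1 = a_2 = 0$. Assume that $\mu_2 \ll \mu_1$. Then the
ratio $(B_2 z, z)/(B_1 z, z)$ is necessarily bounded for $z \in \mathcal{X}$. Indeed if one can
find a sequence $z_n$ such that

$$\lim_{n \to \infty} \frac{(B_2 z_n, z_n)}{(B_1 z_n, z_n)} = +\infty,$$

then $\frac{(z_n, x)}{\sqrt{(B_2 z_n, z_n)}} \to 0$ in measure $\mu_1$, while

$$\mu_2 \Big\{ x : \frac{|(z_n, x)|}{\sqrt{(B_2 z_n, z_n)}} \le \varepsilon \Big\} = \frac{1}{\sqrt{2\pi}} \int_{|t| \le \varepsilon} e^{-t^2/2}\, dt \le \frac{2\varepsilon}{\sqrt{2\pi}};$$

hence $\frac{(z_n, x)}{\sqrt{(B_2 z_n, z_n)}}$ does not tend to zero in measure $\mu_2$. This contradicts
the absolute continuity of $\mu_2$ with respect to $\mu_1$. It is easy to see that the
unboundedness of $\frac{(B_2 z, z)}{(B_1 z, z)}$ implies that measures $\mu_1$ and $\mu_2$ are (even)
singular. Since singularity is a symmetric relation, the same argument applies
to the inverse ratio; hence this ratio is also bounded from below by a positive number.
It follows from here that the ranges of values of operators $B_1^{1/2}$ and $B_2^{1/2}$
coincide and that the operators $C = B_1^{1/2} B_2^{-1/2}$ and $C^{-1} = B_2^{1/2} B_1^{-1/2}$ are
bounded. Note that the boundedness of the operator $C$ implies that
$(z, B_2^{-1/2}x)$ is a measurable functional not only in measure $\mu_2$ (cf. Chapter V,
Section 6) but in measure $\mu_1$ as well. This is because

$$(z, B_2^{-1/2}x) = (z, C^* B_1^{-1/2}x) = (Cz, B_1^{-1/2}x).$$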



Consider the self-adjoint operator $C^*C = B_2^{-1/2} B_1 B_2^{-1/2}$. We show that
$C^*C = I + D$, where $D$ is a completely continuous operator. To do this,
it is sufficient to verify that if $E_\lambda$ is the resolution of the identity for the
operator $D$, then the projectors $E_\lambda$ for $\lambda < 0$ and $I - E_\lambda$ for $\lambda > 0$ map $\mathcal{X}$
into a finite-dimensional subspace. We first show that there is no eigenvalue
$\lambda \ne 0$ for the operator $D$ such that an infinite-dimensional proper
subspace corresponds to this eigenvalue. Otherwise an infinite orthonormal
sequence $z_k$ in this subspace can be found such that

$$\frac1n \sum_{k=1}^n (z_k, B_2^{-1/2}x)^2 \to 1 \ \text{in measure } \mu_2 \quad \text{and} \quad \frac1n \sum_{k=1}^n (z_k, B_2^{-1/2}x)^2 \to 1 + \lambda \ \text{in measure } \mu_1,$$

since on each one of the probability spaces $(\mathcal{X}, \mathfrak{B}, \mu_1)$ and $(\mathcal{X}, \mathfrak{B}, \mu_2)$ the
variables $(z_k, B_2^{-1/2}x)$ form a sequence of independent Gaussian variables with
means 0 and variances $1 + \lambda$ and 1 correspondingly.

Indeed

$$\int (z_k, B_2^{-1/2}x)(z_j, B_2^{-1/2}x)\, \mu_2(dx) = (z_k, z_j) = \delta_{kj},$$

$$\int (z_k, B_2^{-1/2}x)(z_j, B_2^{-1/2}x)\, \mu_1(dx) = (B_1 B_2^{-1/2} z_k, B_2^{-1/2} z_j) = (z_k, z_j) + (D z_k, z_j) = (1 + \lambda)\, \delta_{kj}.$$

The fact that

$$\frac1n \sum_{k=1}^n (z_k, B_2^{-1/2}x)^2$$

converges in measures $\mu_1$ and $\mu_2$ to different constants implies the
orthogonality of these measures. Now let

where 0 >A = Ao>'^i > ••• > ^ are non-empty subspaces. 

Then (z^, B^^x) are, as before, independent Gaussian variables on the 
probability spaces (^, ®, //i) and (^, ©, ^12). 

Indeed, 

I (Zfc, Bpx) (zj, Bpx) jU2(dx) = Stj, 

I (Zk, Bpx) (zj, Bpx) fij^{dx) = 3kj + (Dzi„ z^ = 

Alt - 1 

A/c 



Using the strong law of large numbers we can assert that 
1 " 

- Z (^fc’ ^2 ^xY~^\ in measure 1x2 and that 
nk=i 

Afc- 1 

lim - E (Zk, Bpxf < lim - E ( 1 + 

«-*‘oonfc=i n-^co n k=i\ 

2k 



MiE^Zk, Zfc) kl+/l<l 



in measure $\mu_1$. From these two relations it again follows that the measures
$\mu_1$ and $\mu_2$ are singular. To complete the proof one need merely note that
for $\lambda < 0$ the subspace $E_\lambda \mathcal{X}$ is infinite-dimensional in the case when
either an infinite-dimensional proper subspace corresponds to some
negative eigenvalue or an infinite number of disjoint intervals on $(-\infty, \lambda)$ exists such that
the increment of $E_\lambda$ is non-zero on each one of them. We have thus shown
that $E_\lambda \mathcal{X}$ is finite-dimensional for $\lambda < 0$. In the same manner one can
show that $(I - E_\lambda)\mathcal{X}$ is finite-dimensional for $\lambda > 0$.

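Thus the operator $D$ is completely continuous. Let $e_1, e_2, \ldots$ be the
eigenvectors of the operator $D$ and let $\delta_1, \delta_2, \ldots$ be the corresponding
eigenvalues. We now show that the absolute continuity of $\mu_2$ with respect
to $\mu_1$ implies that $\sum_{k=1}^\infty \delta_k^2 < \infty$.

Indeed if $\sum_{k=1}^\infty \delta_k^2 = +\infty$, then we take the sequence of functions

$$g_n(x) = \Big( \sum_{k=1}^n \delta_k^2 \Big)^{-1} \sum_{k=1}^n \delta_k \big[ (e_k, B_2^{-1/2}x)^2 - 1 \big].$$

We have already noted that if vectors $z_k$ belong to distinct orthogonal
proper subspaces of the operator $D$, then the variables $(z_k, B_2^{-1/2}x)$ are
independent and Gaussian on the probability spaces $(\mathcal{X}, \mathfrak{B}, \mu_1)$
and $(\mathcal{X}, \mathfrak{B}, \mu_2)$. It follows from the relations

$$\int g_n(x)\, \mu_2(dx) = 0,$$

$$\int g_n^2(x)\, \mu_2(dx) = \Big( \sum_{k=1}^n \delta_k^2 \Big)^{-2} \sum_{k=1}^n \delta_k^2 \int \big[ (e_k, B_2^{-1/2}x)^4 - 2 (e_k, B_2^{-1/2}x)^2 + 1 \big] \mu_2(dx) = 2 \Big( \sum_{k=1}^n \delta_k^2 \Big)^{-1},$$

$$\int g_n(x)\, \mu_1(dx) = \Big( \sum_{k=1}^n \delta_k^2 \Big)^{-1} \sum_{k=1}^n \delta_k (D e_k, e_k) = 1,$$

$$\int (g_n(x) - 1)^2 \mu_1(dx) = 2 \Big( \sum_{k=1}^n \delta_k^2 \Big)^{-2} \sum_{k=1}^n \delta_k^2 (1 + \delta_k)^2$$

that $g_n(x) \to 0$ in measure $\mu_2$ and $g_n(x) \to 1$ in measure $\mu_1$. Hence the
condition $\sum_{k=1}^\infty \delta_k^2 = +\infty$ implies the orthogonality of the measures $\mu_1$ and
$\mu_2$. Another necessary condition, satisfied by $\delta_k$, follows from the relation

$$1 + \delta_k = ((I + D) e_k, e_k) = \lim_{n \to \infty} \frac{(B_1 z_n, z_n)}{(B_2 z_n, z_n)},$$

where $z_n$ is a sequence of vectors in $\mathcal{X}$ such that $B_2^{1/2} z_n \to e_k$. Hence,
$\delta_k > -1$.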

00 

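Now let $\delta_k > -1$ and $\sum_{k=1}^\infty \delta_k^2 < \infty$. We show that the measures $\mu_1$ and
$\mu_2$ are equivalent. To do this consider the measure $\tilde{\mu}$ defined by the
equality

$$\tilde{\mu}(A) = \int_A \rho(x)\, \mu_1(dx),$$

where

$$\rho(x) = \exp\Big\{ -\frac12 \sum_{k=1}^\infty \Big[ \frac{\delta_k}{1 + \delta_k} (B_2^{-1/2}x, e_k)^2 - \ln(1 + \delta_k) \Big] \Big\}.$$

The convergence of the series

$$\sum_{k=1}^\infty \Big[ \frac{\delta_k}{1 + \delta_k} (B_2^{-1/2}x, e_k)^2 - \ln(1 + \delta_k) \Big]$$

in measure $\mu_1$ follows from the fact that on the probability space
$(\mathcal{X}, \mathfrak{B}, \mu_1)$ this is a series in independent random variables such that the
corresponding series of mathematical expectations and variances given by

$$\sum_{k=1}^\infty \big[ \delta_k - \ln(1 + \delta_k) \big], \qquad 2 \sum_{k=1}^\infty \delta_k^2$$

are convergent.

We find the characteristic functional $\chi(z)$ of measure $\tilde{\mu}$.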







For any $z \in \mathcal{X}$ the relation

$$(z, x) = (B_2^{1/2}z, B_2^{-1/2}x) = \sum_{k=1}^\infty (B_2^{1/2}z, e_k)(B_2^{-1/2}x, e_k)$$

is satisfied, where the series on the right is convergent $\mu_1$-almost everywhere.
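Utilizing the fact that the variables $(B_2^{-1/2}x, e_k)$ on the probability
space $(\mathcal{X}, \mathfrak{B}, \mu_1)$ are independent and Gaussian with mean value 0 and
variances $1 + \delta_k$ we obtain

$$\chi(z) = E \exp\Big\{ i \sum_{k=1}^\infty (B_2^{1/2}z, e_k)(B_2^{-1/2}x, e_k) - \frac12 \sum_{k=1}^\infty \Big[ \frac{\delta_k}{1 + \delta_k} (B_2^{-1/2}x, e_k)^2 - \ln(1 + \delta_k) \Big] \Big\} =$$

$$= \prod_{k=1}^\infty E \exp\Big\{ i (B_2^{1/2}z, e_k)(B_2^{-1/2}x, e_k) - \frac{\delta_k}{2(1 + \delta_k)} (B_2^{-1/2}x, e_k)^2 + \frac12 \ln(1 + \delta_k) \Big\} =$$

$$= \prod_{k=1}^\infty \exp\{ -\tfrac12 (B_2^{1/2}z, e_k)^2 \} = \exp\{ -\tfrac12 (B_2 z, z) \}.$$

Since the characteristic functional of measure $\tilde{\mu}$ coincides with the
characteristic functional of measure $\mu_2$ we have $\mu_2 = \tilde{\mu}$ and $\frac{d\mu_2}{d\mu_1}(x) = \rho(x)$.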



Thus the following theorem is established: 



Theorem 2. Let $\mu_1$ and $\mu_2$ be Gaussian measures with mean values 0
and correlation operators $B_k$, $k = 1, 2$. In order that measures $\mu_1$ and $\mu_2$
be equivalent, it is necessary and sufficient that the operator
$D = B_2^{-1/2} B_1 B_2^{-1/2} - I$ be a Hilbert-Schmidt operator and its eigenvalues $\delta_k$ satisfy
the inequality $\delta_k > -1$. If this condition is violated then the measures $\mu_1$ and
$\mu_2$ are orthogonal. In the case when measures $\mu_1$ and $\mu_2$ are equivalent the
following formula

$$\frac{d\mu_2}{d\mu_1}(x) = \exp\Big\{ -\frac12 \sum_{k=1}^\infty \Big[ \frac{\delta_k}{1 + \delta_k} (B_2^{-1/2}x, e_k)^2 - \ln(1 + \delta_k) \Big] \Big\}$$ (2)

is valid, where $e_k$ are the eigenvectors of the operator $D$ corresponding to
the eigenvalues $\delta_k$.
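
As a finite-dimensional illustration of formula (2), the sketch below assumes that $B_1$ and $B_2$ are diagonal in the same basis of $R^n$, with hypothetical positive entries `b1` and `b2`. Under that assumption $D$ is diagonal with $\delta_k = b_{1,k}/b_{2,k} - 1$, and the logarithm of (2) can be compared with the difference of the two centered Gaussian log-densities.

```python
import math


def log_density_formula2(x, b1, b2):
    """Logarithm of the right-hand side of (2) for B1 = diag(b1), B2 = diag(b2)."""
    total = 0.0
    for xi, v1, v2 in zip(x, b1, b2):
        delta = v1 / v2 - 1.0          # eigenvalue of D = B2^{-1/2} B1 B2^{-1/2} - I
        coord_sq = xi * xi / v2        # (B2^{-1/2} x, e_k)^2
        total += delta / (1.0 + delta) * coord_sq - math.log1p(delta)
    return -0.5 * total


def log_centered_gauss(x, var):
    """Log-density of N(0, diag(var)) at x."""
    return sum(-0.5 * math.log(2.0 * math.pi * v) - xi * xi / (2.0 * v)
               for xi, v in zip(x, var))


def log_density_direct(x, b1, b2):
    """log(d mu_2 / d mu_1)(x) computed from the two densities."""
    return log_centered_gauss(x, b2) - log_centered_gauss(x, b1)


def hilbert_schmidt_sq(b1, b2):
    """Sum of delta_k^2 over the coordinates retained in the model."""
    return sum((v1 / v2 - 1.0) ** 2 for v1, v2 in zip(b1, b2))


if __name__ == "__main__":
    b1 = [1.0, 0.5, 0.2]              # hypothetical eigenvalues of B1
    b2 = [1.5, 0.4, 0.2]              # hypothetical eigenvalues of B2
    x = [0.3, -0.7, 0.1]
    print(log_density_formula2(x, b1, b2), log_density_direct(x, b1, b2))
    print(hilbert_schmidt_sq(b1, b2))
```

In a finite-dimensional model the sum of $\delta_k^2$ is always finite; the Hilbert-Schmidt condition of the theorem becomes a genuine restriction only when the number of coordinates is infinite.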

Remark. Let p^ and P 2 be two measures as defined in Theorem 2. Denote 

by and the Hilbert spaces of linear measurable functionals with 
respect to measures p^ and p 2 (cf. Section 6, Chapter V). If a sequence of 
functionals $\{l_k(x),\ k = 1, 2, \ldots\}$ exists which belongs to both spaces and
which is a complete orthogonal system in each one of these spaces and



4 '-'= 



Uk(x)T Mdx), 






then under the condition that > 0 , X 




2 

I <ooandmore- 



dlii 



W = expU X 



fc=i L 



ll{x) 



1 



1 

'W\ 



-In 






The proof of this assertion is completely analogous to the proof of the 
sufficiency of the conditions of Theorem 2 . □ 

Now consider the general case. We introduce in addition to measures
$\mu_1$ and $\mu_2$ a measure $\mu_{12}$ with mean value $a_1$ and correlation operator
$B_2$. We show that the condition $\mu_2 \ll \mu_1$ implies the relation $\mu_2 \ll \mu_{12} \ll \mu_1$, so
that

$$\frac{d\mu_2}{d\mu_1} = \frac{d\mu_2}{d\mu_{12}} \cdot \frac{d\mu_{12}}{d\mu_1},$$

and one can also utilize formulas (1) and (2) for computing $d\mu_2/d\mu_{12}$ and
$d\mu_{12}/d\mu_1$ respectively. It is sufficient to show that $\mu_{12} \ll \mu_1$, since in this
case $\mu_{12} \sim \mu_1$ and hence $\mu_2 \ll \mu_{12}$. If $\bar{\mu}_2$ is a measure with mean $a_2 - a_1$
and correlation operator $B_2$ and $\bar{\mu}_1$ is a measure with mean 0 and
correlation operator $B_1$, then $\bar{\mu}_2 \ll \bar{\mu}_1$. Let the measure $\bar{\mu}_i^*$ be defined by the
relation $\bar{\mu}_i^*(A) = \bar{\mu}_i(\{x : -x \in A\})$. Clearly $\bar{\mu}_2^* \ll \bar{\mu}_1^*$. Therefore
$\bar{\mu}_2 * \bar{\mu}_2^* \ll \bar{\mu}_1 * \bar{\mu}_1^*$. It is easy to see that $\bar{\mu}_2 * \bar{\mu}_2^*$ is a Gaussian measure
with mean 0 and correlation operator $2B_2$, and that the measure $\bar{\mu}_1 * \bar{\mu}_1^*$
has the same mean but a different correlation operator (which equals
$2B_1$). Consequently, $\nu_2 \ll \nu_1$ where $\nu_i$ ($i = 1, 2$) is a Gaussian measure with
mean 0 and correlation operator $B_i$. But then $\mu_{12} \ll \mu_1$ also, since $\mu_{12}$ and $\mu_1$
are obtained from measures $\nu_2$ and $\nu_1$ by the translation in the amount
$a_1$. Thus the following theorem is valid in the general case:

Theorem 3. If the measures $\mu_1$ and $\mu_2$ are two Gaussian measures with characteristic functionals

$$\varphi_k(z) = \exp\{ i(a_k, z) - \tfrac12 (B_k z, z) \}, \quad k = 1, 2,$$

then in order that the measures $\mu_1$ and $\mu_2$ be equivalent it is necessary and sufficient that the following conditions be satisfied:

1) $a_2 - a_1 = B_2^{1/2} b$, where $b \in \mathcal{X}$.

2) the operator $D = B_2^{-1/2} B_1 B_2^{-1/2} - I$ is a Hilbert-Schmidt operator and its eigenvalues $\delta_k$ satisfy the inequality $\delta_k > -1$.

If at least one of the conditions is not satisfied the measures $\mu_1$ and $\mu_2$ are orthogonal. The following formula

$$\frac{d\mu_2}{d\mu_1}(x) = \exp\left\{ -\frac12 \sum_{k=1}^{\infty} \left[ (B_2^{-1/2}(x - a_1), e_k)^2\, \frac{\delta_k}{1+\delta_k} - \ln(1+\delta_k) \right] + (B_2^{-1/2}(x - a_1), b) - \frac12 |b|^2 \right\} \tag{3}$$

is valid for equivalent measures, where $e_k$ are the eigenvectors of operator $D$ corresponding to the eigenvalues $\delta_k$.

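We shall consider some sufficient conditions for absolute continuity of Gaussian measures. The conditions presented below may turn out to be more convenient for applications, since they do not involve fractional powers of correlation operators. Assume that the operators $B_1 B_2^{-1}$ and $B_2 B_1^{-1}$ are bounded. Set $V = B_1 B_2^{-1} - I$. Since $V = B_2^{1/2} D B_2^{-1/2}$, we have $V^2 = B_2^{1/2} D^2 B_2^{-1/2}$. Let $f_k$ be an orthonormal sequence of eigenvectors of operator $B_2$ and $\lambda_k$ the corresponding eigenvalues. Then

$$\operatorname{Sp} D^2 = \sum_{k=1}^{\infty} (D^2 f_k, f_k) = \sum_{k=1}^{\infty} \left( D^2 \frac{1}{\sqrt{\lambda_k}} f_k,\ \sqrt{\lambda_k}\, f_k \right) = \sum_{k=1}^{\infty} (D^2 B_2^{-1/2} f_k, B_2^{1/2} f_k) = \sum_{k=1}^{\infty} (V^2 f_k, f_k).$$

Hence $\operatorname{Sp} D^2 < \infty$ provided only that $\operatorname{Sp} V^2$ is finite. Since $V$ is an asymmetric operator the verification of conditions for the existence of $\operatorname{Sp} V^2$ may be complicated. However, utilizing the inequality

$$\sum_{k=1}^{\infty} |(V^2 e_k, e_k)| = \sum_{k=1}^{\infty} |(V e_k, V^* e_k)| \le \sum_{k=1}^{\infty} |V e_k|\,|V^* e_k| \le \sqrt{\sum_{k=1}^{\infty} |V e_k|^2 \sum_{k=1}^{\infty} |V^* e_k|^2} = \sqrt{\operatorname{Sp} V^*V \cdot \operatorname{Sp} VV^*} = \operatorname{Sp} V^*V$$

(recall that $\operatorname{Sp} V^*V = \operatorname{Sp} VV^*$) one can formulate the condition for absolute continuity in terms of the trace (spur) of a symmetric nonnegative operator $V^*V$.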

Theorem 4. Let $\mu_1$ and $\mu_2$ be Gaussian measures with mean 0 and correlation operators $B_1$ and $B_2$. If a bounded operator $V$ exists satisfying the relations

$$V B_2 = B_1 - B_2, \qquad \operatorname{Sp} V^*V < \infty,$$

and if $-1$ does not belong to the spectrum of this operator then $\mu_1 \sim \mu_2$.

Proof. It is sufficient to show that in the case when $I + V$ is invertible, i.e. $B_2 B_1^{-1}$ is bounded, then $\delta_k > -1$. Let $\delta_m = -1$ for some $m$. Then putting $z = B_2^{1/2} e_m$ we have

$$(I + V) z = z + \delta_m B_2^{1/2} e_m = z - B_2^{1/2} e_m = 0,$$

i.e. $-1$ is an eigenvalue of operator $V$, which is impossible by the assumption of the theorem. □

We note another simple formula for the density of one Gaussian 
measure with respect to another in the case when the means are zero. This 
formula is meaningful under certain additional restrictions, but it is more 
convenient since it does not involve eigenvectors and eigenvalues of the 
operator D: 



Remark. If the conditions of Theorem 4 are satisfied and $\operatorname{Sp} V$ is defined (i.e. the series $\sum_k (V e_k, e_k)$ is convergent in any orthonormal basis) then the following formula

$$\frac{d\mu_2}{d\mu_1}(x) = \sqrt{\det(I + V)}\, \exp\left\{ -\frac12 (B_1^{-1} V x, x) \right\} \tag{4}$$

is valid. In view of the results presented in Section 6 of Chapter V, the quadratic functional $(B_1^{-1} V x, x)$ is measurable with respect to $\mu_1$ since $\operatorname{Sp} V$ exists and $\operatorname{Sp} V^*V < \infty$. To prove formula (4) we note that the existence of $\operatorname{Sp} V$ implies the existence of $\operatorname{Sp} D$ and hence the convergence of the series



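$$\sum_{k=1}^{\infty} \delta_k, \qquad \sum_{k=1}^{\infty} \ln(1+\delta_k), \qquad \sum_{k=1}^{\infty} (B_2^{-1/2} x, e_k)^2\, \frac{\delta_k}{1+\delta_k}.$$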



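Let $P_n$ be the projector on the subspace spanned by $e_1, \ldots, e_n$. Then

$$\begin{aligned}
\sum_{k=1}^{\infty} (B_2^{-1/2} x, e_k)^2\, \frac{\delta_k}{1+\delta_k}
&= \sum_{k=1}^{\infty} \left[ (B_2^{-1/2} x, e_k)^2 - (B_2^{-1/2} x, e_k)^2\, \frac{1}{1+\delta_k} \right] \\
&= \sum_{k=1}^{\infty} \left[ (B_2^{-1/2} x, e_k)^2 - (B_2^{-1/2} x, e_k)(B_2^{1/2} B_1^{-1} x, e_k) \right] \\
&= \lim_{n\to\infty} \sum_{k=1}^{\infty} \left[ (P_n B_2^{-1/2} x, e_k)^2 - (P_n B_2^{-1/2} x, e_k)(B_2^{1/2} B_1^{-1} x, e_k) \right] \\
&= \lim_{n\to\infty} \left[ (P_n B_2^{-1/2} x, B_2^{-1/2} x) - (P_n B_2^{-1/2} x, B_2^{1/2} B_1^{-1} x) \right] \\
&= ((B_2^{-1} - B_1^{-1}) x, x) = (B_1^{-1} V x, x).
\end{aligned}$$

Here the second equality uses $(I + D)^{-1} = B_2^{1/2} B_1^{-1} B_2^{1/2}$, so that $(B_2^{-1/2} x, e_k)/(1+\delta_k) = (B_2^{1/2} B_1^{-1} x, e_k)$.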
Furthermore,

$$\sum_{k=1}^{\infty} \log(1+\delta_k) = \log|\det(I + D)| = \log|\det(I + B_2^{1/2} D B_2^{-1/2})| = \log|\det(I + V)|.$$

Substituting the derived expressions for $\sum_{k=1}^{\infty} (B_2^{-1/2} x, e_k)^2\, \frac{\delta_k}{1+\delta_k}$ and $\sum_{k=1}^{\infty} \log(1+\delta_k)$ in (2) we obtain formula (4).
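As an illustration only, formula (4) can be checked in the simplest one-dimensional case. Assume (this setting is not in the text) that $B_1 = \sigma_1^2$ and $B_2 = \sigma_2^2$ are positive numbers, so that $V = \sigma_1^2/\sigma_2^2 - 1$:

```latex
\begin{aligned}
\sqrt{\det(I+V)} &= \frac{\sigma_1}{\sigma_2}, \qquad
(B_1^{-1} V x, x) = \Big(\frac{1}{\sigma_2^2} - \frac{1}{\sigma_1^2}\Big) x^2, \\
\frac{d\mu_2}{d\mu_1}(x)
  &= \frac{\sigma_1}{\sigma_2}
     \exp\Big\{ -\frac{x^2}{2}\Big(\frac{1}{\sigma_2^2} - \frac{1}{\sigma_1^2}\Big) \Big\}
   = \frac{(2\pi\sigma_2^2)^{-1/2}\, e^{-x^2/(2\sigma_2^2)}}
          {(2\pi\sigma_1^2)^{-1/2}\, e^{-x^2/(2\sigma_1^2)}} .
\end{aligned}
```

The right-hand side is the ratio of the $N(0, \sigma_2^2)$ and $N(0, \sigma_1^2)$ densities, as expected.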



§5. Equivalence and Orthogonality of Measures Associated with 
Stationary Gaussian Processes 

Consider two real stationary Gaussian processes $\xi_1(t)$ and $\xi_2(t)$ on the interval $[-T, T]$. Associated with these processes are Gaussian measures $\mu_1$ and $\mu_2$ on the space $\mathcal{L}_2[-T, T]$ of all functions $x(t)$ which are square integrable on $[-T, T]$. It is more convenient to consider the space of complex-valued functions with the scalar product

$$(x, y) = \int_{-T}^{T} x(t)\,\overline{y(t)}\,dt.$$

Let $\mathsf{E}\xi_j(t) = a_j(t)$, and let $R_j(t)$ be the correlation function of the process $\xi_j(t)$. Then $a_j(\cdot)$ is the mean value of the measure $\mu_j$ and its correlation operator $B_j$ is defined by the equation

$$(B_j x, y) = \int_{-T}^{T}\int_{-T}^{T} R_j(t - s)\, x(t)\, \overline{y(s)}\, dt\, ds.$$

The purpose of this section is to study the conditions for equivalence and orthogonality of measures $\mu_1$ and $\mu_2$ of this special type.

Denote by $F_j(\lambda)$ the spectral function of the process $\xi_j(t)$:

$$R_j(t) = \int e^{i\lambda t}\, dF_j(\lambda).$$

Let the process $\xi_j(t)$ possess the following spectral representation

$$\xi_j(t) = a_j(t) + \int e^{i\lambda t}\, dy_j(\lambda), \tag{1}$$

where $y_j(\lambda)$ is the complex-valued Gaussian process with non-correlated increments for which

$$\mathsf{E}|y_j(\lambda_2) - y_j(\lambda_1)|^2 = |F_j(\lambda_2) - F_j(\lambda_1)|.$$

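Henceforth we shall make use of the space $\mathcal{W}_T$ of functions $g(\lambda)$ admitting representation

$$g(\lambda) = \int_{-T}^{T} e^{i\lambda t} \varphi(t)\, dt,$$

where $\varphi(\cdot) \in \mathcal{L}_2[-T, T]$. The space $\mathcal{W}_T$ coincides with the space of the entire analytic functions of exponential type not higher than $T$ which are square integrable on the real line. In what follows we shall consider functions belonging to $\mathcal{W}_T$ only on the real line. Denote by $\mathcal{W}_T(F_i)$ the closure of $\mathcal{W}_T$ in the metric

$$\|g\|_{F_i}^2 = \int |g(\lambda)|^2\, dF_i(\lambda).$$

The space $\mathcal{W}_T(F_i)$ is the Hilbert space with the scalar product

$$(g_1, g_2)_{F_i} = \int g_1(\lambda)\, \overline{g_2(\lambda)}\, dF_i(\lambda).$$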

First we investigate the condition of equivalence and orthogonality of measures which correspond to processes with the same correlation function $R(t)$ and different mean values.

Let $R_1(t) = R_2(t)$, $a_1(t) = 0$, $a_2(t) = a(t)$.

Theorem 1. In order that measures $\mu_1$ and $\mu_2$ be equivalent, it is necessary and sufficient that the function $a(t)$ admit representation

$$a(t) = \int e^{-i\lambda t}\, b(\lambda)\, dF_1(\lambda) \tag{2}$$

for $t \in [-T, T]$ where $b(\lambda) \in \mathcal{W}_T(F_1)$. If this condition is satisfied then

$$\frac{d\mu_2}{d\mu_1}(\xi_1(\cdot)) = \exp\left\{ \int b(\lambda)\, dy_1(\lambda) - \frac12 \int |b(\lambda)|^2\, dF_1(\lambda) \right\}; \tag{3}$$

here $y_1(\lambda)$ is the function appearing in the spectral representation of $\xi_1(t)$ as given by formula (1).

Proof. First assume that $\mu_1 \sim \mu_2$. As follows from Theorem 1 of Section 4,

$$\frac{d\mu_2}{d\mu_1}(x) = \exp\{ l(x) - c \},$$

where $l(x)$ is a measurable linear functional in measure $\mu_1$ and $c$ is a constant. It was established in Section 6 of Chapter V that every measurable linear functional $l(\xi_1(\cdot))$ of a stationary Gaussian process $\xi_1(t)$ on $[-T, T]$ can be represented in the form

$$l(\xi_1(\cdot)) = \int b(\lambda)\, dy_1(\lambda),$$

where $b(\lambda) \in \mathcal{W}_T(F_1)$. To find the relation between $a(t)$, $b(\lambda)$ and the constant $c$ we write the characteristic function of the variable $\xi_2(t)$:

$$\exp\{ i a(t) z - \tfrac12 z^2 R_1(0) \} = \mathsf{E} e^{i z \xi_2(t)} = \mathsf{E}\, e^{i z \xi_1(t)}\, \frac{d\mu_2}{d\mu_1}(\xi_1(\cdot)) = \mathsf{E} \exp\left\{ \int [b(\lambda) + i z e^{i\lambda t}]\, dy_1(\lambda) - c \right\}.$$



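We now utilize the formula $\mathsf{E} e^{\zeta} = \exp\{\tfrac12 \mathsf{E}\zeta^2\}$ for $\mathsf{E}\zeta = 0$, which is valid for any Gaussian variable (including a complex-valued one). Note that since $\xi(t)$ is real, it follows that $dy(\lambda) = \overline{dy(-\lambda)}$ and $dF(-\lambda) = dF(\lambda)$; therefore

$$\begin{aligned}
\mathsf{E}\left( \int [b(\lambda) + i z e^{i\lambda t}]\, dy_1(\lambda) \right)^2
&= \mathsf{E} \int [b(\lambda) + i z e^{i\lambda t}]\, dy_1(\lambda) \int [b(\lambda) + i z e^{i\lambda t}]\, \overline{dy_1(-\lambda)} \\
&= \mathsf{E} \int [b(\lambda) + i z e^{i\lambda t}]\, dy_1(\lambda) \int [b(-\lambda) + i z e^{-i\lambda t}]\, \overline{dy_1(\lambda)} \\
&= \int [b(\lambda) + i z e^{i\lambda t}]\,[b(-\lambda) + i z e^{-i\lambda t}]\, dF_1(\lambda) \\
&= \int b(\lambda) b(-\lambda)\, dF_1(\lambda) + 2 i z \int b(\lambda) e^{-i\lambda t}\, dF_1(\lambda) - z^2 R_1(0).
\end{aligned}$$

Finally since $l(\xi_1(\cdot))$ is real it follows that

$$\int b(\lambda)\, dy_1(\lambda) = \overline{\int b(\lambda)\, dy_1(\lambda)} = \int \overline{b(\lambda)}\, \overline{dy_1(\lambda)} = \int \overline{b(-\lambda)}\, dy_1(\lambda).$$

Therefore $b(\lambda) = \overline{b(-\lambda)}$, $b(-\lambda) = \overline{b(\lambda)}$ and

$$\int b(\lambda) b(-\lambda)\, dF_1(\lambda) = \int |b(\lambda)|^2\, dF_1(\lambda).$$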
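Thus

$$\exp\{ i z a(t) - \tfrac12 R_1(0) z^2 \} = \exp\left\{ -c + \frac12 \int |b(\lambda)|^2\, dF_1(\lambda) + i z \int e^{-i\lambda t}\, b(\lambda)\, dF_1(\lambda) - \frac12 z^2 R_1(0) \right\}$$

and hence

$$c = \frac12 \int |b(\lambda)|^2\, dF_1(\lambda), \qquad a(t) = \int e^{-i\lambda t}\, b(\lambda)\, dF_1(\lambda).$$

We have thus established the necessity of the conditions of the theorem and have verified formula (3).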

We now proceed to the proof of the sufficiency of the conditions of the 
theorem. 

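Let formula (2) be satisfied. Introduce a measure $\tilde\mu$ which is absolutely continuous with respect to measure $\mu_1$ and has density $d\tilde\mu/d\mu_1$ which coincides with the right-hand side of equality (3). We show that measures $\mu_2$ and $\tilde\mu$ coincide. For this purpose we compare their characteristic functionals (the characteristic functionals of measures $\tilde\mu$ and $\mu_2$ will be denoted by $\tilde\chi$ and $\chi_2$ respectively). Clearly,

$$\chi_2(z) = \exp\left\{ i \int_{-T}^{T} a(t) z(t)\, dt - \frac12 \int_{-T}^{T}\int_{-T}^{T} R(t - s)\, z(t) z(s)\, dt\, ds \right\}.$$

Next,

$$\begin{aligned}
\tilde\chi(z) &= \mathsf{E} \exp\left\{ i \int_{-T}^{T} z(t) \xi_1(t)\, dt \right\} \frac{d\tilde\mu}{d\mu_1}(\xi_1(\cdot)) \\
&= \mathsf{E} \exp\left\{ \int \left[ b(\lambda) + i \int_{-T}^{T} z(t) e^{i\lambda t}\, dt \right] dy_1(\lambda) - \frac12 \int |b(\lambda)|^2\, dF_1(\lambda) \right\} \\
&= \exp\left\{ -\frac12 \int |b(\lambda)|^2\, dF_1(\lambda) + \frac12\, \mathsf{E}\left( \int \left[ b(\lambda) + i \int_{-T}^{T} z(t) e^{i\lambda t}\, dt \right] dy_1(\lambda) \right)^2 \right\} \\
&= \exp\left\{ \frac{i}{2} \int_{-T}^{T} z(t) \int [ b(\lambda) e^{-i\lambda t} + b(-\lambda) e^{i\lambda t} ]\, dF_1(\lambda)\, dt - \frac12 \int_{-T}^{T}\int_{-T}^{T} z(t) z(s) R(t - s)\, dt\, ds \right\} = \chi_2(z).
\end{aligned}$$

Since $\chi_2 = \tilde\chi$ it follows that $\mu_2 = \tilde\mu$. The theorem is thus proved. □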
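As a brief consistency check on formula (3), offered here as a worked sketch rather than as part of the original argument, one can verify that the density has expectation one under $\mu_1$. It uses only the Gaussian identity $\mathsf{E}e^{\zeta} = \exp\{\tfrac12\mathsf{E}\zeta^2\}$ and the relation $b(-\lambda) = \overline{b(\lambda)}$ obtained in the proof:

```latex
\begin{aligned}
\mathsf{E}\Big(\int b(\lambda)\,dy_1(\lambda)\Big)^2
  &= \int b(\lambda)\,b(-\lambda)\,dF_1(\lambda)
   = \int |b(\lambda)|^2\,dF_1(\lambda), \\
\mathsf{E}\,\frac{d\mu_2}{d\mu_1}(\xi_1(\cdot))
  &= \exp\Big\{\tfrac12\,\mathsf{E}\Big(\int b\,dy_1\Big)^2
      - \tfrac12\int |b(\lambda)|^2\,dF_1(\lambda)\Big\} = 1 .
\end{aligned}
```

This is the same computation as in the necessity part with $z = 0$, and it fixes the constant $c = \tfrac12\int|b(\lambda)|^2\,dF_1(\lambda)$.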

Corollary. It follows from Theorem 1 that if for some $T$ the measures $\mu_1^T$ and $\mu_2^T$, associated with the processes $\xi_1(t)$ and $\xi_1(t) + a(t)$ on the interval $[-T, T]$, are equivalent then an extension of the function $a(t)$ to the whole line always exists such that the measures $\mu_1^{\infty}$ and $\mu_2^{\infty}$ corresponding to these processes on $(-\infty, \infty)$ are also equivalent. The right-hand side of equality (2) defined for all values of $t$ may serve as such an extension.

Assume that the spectral function $F_1(\lambda)$ possesses spectral density $f_1(\lambda)$. Let $a^*(t)$ be the above stated extension of $a(t)$ for which the measures $\mu_1^{\infty}$ and $\mu_2^{\infty}$ are equivalent. If $\tilde a(\lambda)$ is the Fourier transform of the function $a^*(t)$, then

$$\tilde a(\lambda) = 2\pi\, b(\lambda) f_1(\lambda).$$

Therefore in order that the measures $\mu_1^T$ and $\mu_2^T$ be equivalent (these are the initial measures discussed in Theorem 1), it is necessary and sufficient that a continuation of the function $a(t)$ exist on $(-\infty, \infty)$ such that the Fourier transform $\tilde a(\lambda)$ of this extension will satisfy the relation

$$\int \frac{|\tilde a(\lambda)|^2}{f_1(\lambda)}\, d\lambda < \infty.$$

The function $\dfrac{\tilde a(\lambda)}{2\pi f_1(\lambda)}$ can be taken in place of $b(\lambda)$.
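The integral criterion of the corollary can be evaluated numerically for a concrete pair. The following sketch rests on illustrative assumptions that are not in the text: the spectral density $f_1(\lambda) = 1/(\pi(1+\lambda^2))$ (correlation function $e^{-|t|}$) and the extension $a^*(t) = e^{-t^2/2}$, whose Fourier transform under the convention $\tilde a(\lambda) = \int e^{i\lambda t} a^*(t)\,dt$ is $\sqrt{2\pi}\,e^{-\lambda^2/2}$. For this pair the integral has the closed form $3\pi^{5/2}$.

```python
import math

def f1(lam):
    # Assumed spectral density of a process with correlation function exp(-|t|).
    return 1.0 / (math.pi * (1.0 + lam * lam))

def a_hat(lam):
    # Fourier transform of the assumed extension a*(t) = exp(-t^2 / 2).
    return math.sqrt(2.0 * math.pi) * math.exp(-0.5 * lam * lam)

def equivalence_integral(a_hat, f1, L=12.0, n=24000):
    """Trapezoidal approximation of the integral of |a_hat|^2 / f1 over [-L, L]."""
    h = 2.0 * L / n
    total = 0.0
    for i in range(n + 1):
        lam = -L + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * abs(a_hat(lam)) ** 2 / f1(lam)
    return total * h

value = equivalence_integral(a_hat, f1)
print(value, 3.0 * math.pi ** 2.5)
```

A finite value indicates that, under these assumptions, shifting the mean by $a(t) = e^{-t^2/2}$ on $[-T, T]$ gives an equivalent measure; the corresponding $b(\lambda)$ is `a_hat(lam) / (2 * math.pi * f1(lam))`.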



Consider now processes $\xi_j(t)$ with zero mean value and unequal correlation functions $R_1(t)$ and $R_2(t)$. Denote by $\mathcal{W}_T^2$ the space of functions $b(\alpha, \beta)$ admitting representation in the form

$$b(\alpha, \beta) = \int_{-T}^{T}\int_{-T}^{T} e^{i\alpha t - i\beta s}\, \varphi(t, s)\, dt\, ds,$$

where $\varphi$ is a function which is square integrable on $[-T, T] \times [-T, T]$. Denote by $\mathcal{W}_T^2(F_1)$ the closure of $\mathcal{W}_T^2$ in the metric generated by the scalar product

$$(b_1, b_2) = \int\!\!\int b_1(\alpha, \beta)\, \overline{b_2(\alpha, \beta)}\, dF_1(\alpha)\, dF_1(\beta).$$



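Theorem 2. If $\mathsf{E}\xi_j(t) = 0$, $j = 1, 2$, then in order that the measures $\mu_1$ and $\mu_2$ be equivalent it is necessary and sufficient that a function $b(\alpha, \beta) \in \mathcal{W}_T^2(F_1)$ exist such that the representation

$$R_2(t - s) - R_1(t - s) = \int\!\!\int e^{-i\alpha t + i\beta s}\, b(\alpha, \beta)\, dF_1(\alpha)\, dF_1(\beta) \tag{4}$$

be valid. Moreover

$$\frac{d\mu_2}{d\mu_1}(\xi_1(\cdot)) = \exp\left\{ \frac12 \int\!\!\int \Phi(\alpha, \beta)\, dy_1(\alpha)\, \overline{dy_1(\beta)} + c \right\}, \tag{5}$$

where the function $\Phi(\alpha, \beta)$ is connected with $b(\alpha, \beta)$ by the relation

$$\int \Phi(\alpha, \beta)\, b(\beta, \gamma)\, dF_1(\beta) = b(\alpha, \gamma) - \Phi(\alpha, \gamma), \tag{6}$$

$$c = -\ln \mathsf{E} \exp\left\{ \frac12 \int\!\!\int \Phi(\alpha, \beta)\, dy_1(\alpha)\, \overline{dy_1(\beta)} \right\}. \tag{7}$$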



Proof. Necessity. Assume that $\mu_1 \sim \mu_2$. Then the spaces of linear measurable functionals in measures $\mu_1$ and $\mu_2$ coincide: i.e. $\mathcal{L}(\mu_1) = \mathcal{L}(\mu_2)$ (see Section 6, Chapter V, regarding linear measurable functionals). As has already been mentioned, for a given stationary Gaussian process $\xi_j(t)$ with $\mathsf{E}\xi_j(t) = 0$ every measurable linear functional $l(\xi_j)$ can be represented in the form

$$l(\xi_j) = \int g(\alpha)\, dy_j(\alpha),$$

where $g \in \mathcal{W}_T(F_j)$. In the course of the proof of Theorem 2 in Section 4 a sequence of measurable functionals was constructed which forms a complete orthogonal system in $\mathcal{L}(\mu_1)$ and $\mathcal{L}(\mu_2)$ simultaneously (these are the functionals $(B_2^{-1/2} x, e_k) = l_k(x)$ where $e_k$ are the eigenvectors of the operator $D$).

Let

$$l_k(\xi_j(\cdot)) = \int g_k(\alpha)\, dy_j(\alpha).$$

It follows from the orthogonality of $l_k$ with respect to measures $\mu_1$ and $\mu_2$ that

$$0 = \mathsf{E} \int g_k(\alpha)\, dy_j(\alpha)\, \overline{\int g_m(\alpha)\, dy_j(\alpha)} = \int g_k(\alpha)\, \overline{g_m(\alpha)}\, dF_j(\alpha), \quad k \ne m.$$

We normalize $g_k$ so that

$$\int |g_k(\alpha)|^2\, dF_1(\alpha) = 1,$$

and moreover, let

$$\int |g_k(\alpha)|^2\, dF_2(\alpha) = 1 + c_k.$$

It follows from Theorem 2 of Section 4 that $\sum_k c_k^2 < \infty$. We set

$$b(\alpha, \beta) = \sum_{k=1}^{\infty} c_k\, g_k(\alpha)\, \overline{g_k(\beta)}$$

and show that $b(\alpha, \beta)$ satisfies relation (4). Consider the function



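$$\psi(t, s) = \int\!\!\int e^{-i\alpha t + i\beta s}\, b(\alpha, \beta)\, dF_1(\alpha)\, dF_1(\beta) + R_1(t - s) - R_2(t - s).$$

If $z(\alpha) = \int_{-T}^{T} e^{i\alpha t} \varphi(t)\, dt$, then

$$\begin{aligned}
\int_{-T}^{T}\int_{-T}^{T} \psi(t, s)\, \overline{\varphi(t)}\, \varphi(s)\, dt\, ds
&= \int\!\!\int \overline{z(\alpha)}\, z(\beta)\, b(\alpha, \beta)\, dF_1(\alpha)\, dF_1(\beta) + \int |z(\alpha)|^2\, (dF_1(\alpha) - dF_2(\alpha)) \\
&= \sum_{k=1}^{\infty} c_k \left| \int z(\alpha)\, \overline{g_k(\alpha)}\, dF_1(\alpha) \right|^2 + \sum_{k=1}^{\infty} \left| \int z(\alpha)\, \overline{g_k(\alpha)}\, dF_1(\alpha) \right|^2 \\
&\quad - \sum_{k=1}^{\infty} (1 + c_k) \left| \int z(\alpha)\, \overline{g_k(\alpha)}\, dF_1(\alpha) \right|^2 = 0.
\end{aligned}$$

Here we have used the fact that $\{g_k\}$ is a complete orthogonal system both in $\mathcal{W}_T(F_1)$, where $\|g_k\|^2 = 1$, and in $\mathcal{W}_T(F_2)$, where $\|g_k\|^2 = 1 + c_k$. Utilizing the equality $\psi(t, s) = \overline{\psi(s, t)}$ we verify that $\psi(t, s) = 0$. The necessity part of the theorem is thus proved.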

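Sufficiency. We now proceed to prove the sufficiency of the conditions of the theorem and to derive formula (5). Assume that a function $b(\cdot, \cdot) \in \mathcal{W}_T^2(F_1)$ exists which satisfies relation (4). Consider the integral operator

$$V g(\beta) = \int \overline{b(\alpha, \beta)}\, g(\alpha)\, dF_1(\alpha).$$

If $b(\cdot, \cdot) \in \mathcal{W}_T^2(F_1)$ this operator maps $\mathcal{W}_T(F_1)$ into $\mathcal{W}_T(F_1)$. This is easy to verify by noting that

$$V g \in \mathcal{W}_T, \quad \text{if } b(\cdot, \cdot) \in \mathcal{W}_T^2$$

for any bounded function $g$ and that

$$\|V g\|_{F_1}^2 \le \|g\|_{F_1}^2 \int\!\!\int |b(\alpha, \beta)|^2\, dF_1(\alpha)\, dF_1(\beta).$$

Thus the operator $V$ is a bounded self-adjoint operator on $\mathcal{W}_T(F_1)$. Being an integral operator with a square integrable kernel, it is also a completely continuous operator and a Hilbert-Schmidt operator. Denote by $g_k(\alpha)$ the complete orthonormal sequence of eigenfunctions of the operator $V$ and by $\lambda_k$ the corresponding eigenvalues. Then

$$b(\alpha, \beta) = \sum_{k=1}^{\infty} \lambda_k\, g_k(\alpha)\, \overline{g_k(\beta)}, \qquad \sum_{k=1}^{\infty} \lambda_k^2 < \infty.$$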



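By construction the functions $g_k(\alpha)$ are orthogonal in $\mathcal{W}_T(F_1)$. We now show that they are also orthogonal in $\mathcal{W}_T(F_2)$. Let $\varphi_k^n(t)$ be a sequence of functions in $\mathcal{L}_2[-T, T]$ such that

$$\int_{-T}^{T} e^{i\alpha t}\, \varphi_k^n(t)\, dt \to g_k(\alpha)$$

as $n \to \infty$ in the sense of convergence in $\mathcal{W}_T(F_1)$. Then

$$\begin{aligned}
&\int_{-T}^{T}\int_{-T}^{T} R_2(t - s)\, \overline{\varphi_j^n(t)}\, \varphi_k^n(s)\, dt\, ds - \int_{-T}^{T}\int_{-T}^{T} R_1(t - s)\, \overline{\varphi_j^n(t)}\, \varphi_k^n(s)\, dt\, ds \\
&\qquad = \int\!\!\int b(\alpha, \beta) \left[ \int_{-T}^{T} e^{-i\alpha t}\, \overline{\varphi_j^n(t)}\, dt \right] \left[ \int_{-T}^{T} e^{i\beta s}\, \varphi_k^n(s)\, ds \right] dF_1(\alpha)\, dF_1(\beta).
\end{aligned}$$

Approaching the limit in this equality as $n \to \infty$ we obtain:

$$\int g_k(\alpha)\, \overline{g_j(\alpha)}\, dF_2(\alpha) - \int g_k(\alpha)\, \overline{g_j(\alpha)}\, dF_1(\alpha) = \int\!\!\int b(\alpha, \beta)\, \overline{g_j(\alpha)}\, g_k(\beta)\, dF_1(\alpha)\, dF_1(\beta).$$

Since for $k \ne j$

$$\int\!\!\int b(\alpha, \beta)\, \overline{g_j(\alpha)}\, g_k(\beta)\, dF_1(\alpha)\, dF_1(\beta) = 0,$$

it follows that the following equality

$$\int g_k(\alpha)\, \overline{g_j(\alpha)}\, dF_2(\alpha) - \int g_k(\alpha)\, \overline{g_j(\alpha)}\, dF_1(\alpha) = 0$$

is valid for $k \ne j$.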

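Thus the sequence $g_k(\alpha)$ is orthogonal in the space $\mathcal{W}_T(F_1)$ as well as in the space $\mathcal{W}_T(F_2)$. It follows from formula (4) that $b(\alpha, \beta)$ can be chosen in such a manner that $b(\alpha, \beta) = \overline{b(-\alpha, -\beta)}$; in this case we may take $g_k(\alpha)$ satisfying $g_k(\alpha) = \overline{g_k(-\alpha)}$. Under this condition, the functionals $\int g_k(\alpha)\, dy_j(\alpha)$ are real linear functionals on the processes $\xi_j(t)$. In view of the remark after Theorem 2 in Section 4 we have $\mu_1 \sim \mu_2$ and moreover

$$\frac{d\mu_2}{d\mu_1}(\xi_1(\cdot)) = \exp\left\{ \frac12 \sum_{k=1}^{\infty} \left[ \frac{\lambda_k}{1+\lambda_k} \left( \int g_k(\alpha)\, dy_1(\alpha) \right)^2 - \ln(1+\lambda_k) \right] \right\}.$$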



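Next note that in view of formula (15) in Section 6 of Chapter V the following equality

$$\sum_{k=1}^{\infty} \frac{\lambda_k}{1+\lambda_k} \left( \int g_k(\alpha)\, dy_1(\alpha) \right)^2 = \int\!\!\int \Phi(\alpha, \beta)\, dy_1(\alpha)\, \overline{dy_1(\beta)}$$

is valid, where

$$\Phi(\alpha, \beta) = \sum_{k=1}^{\infty} \frac{\lambda_k}{1+\lambda_k}\, g_k(\alpha)\, \overline{g_k(\beta)}.$$

To complete the proof, one need merely note that

$$\int \Phi(\alpha, \gamma)\, b(\gamma, \beta)\, dF_1(\gamma) = \sum_{k=1}^{\infty} \frac{\lambda_k^2}{1+\lambda_k}\, g_k(\alpha)\, \overline{g_k(\beta)} = b(\alpha, \beta) - \Phi(\alpha, \beta).$$

The theorem is thus proved. □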

Assuming the existence of spectral densities $f_j(\lambda) = \dfrac{d}{d\lambda} F_j(\lambda)$ ($j = 1, 2$), we now present certain sufficient conditions for the equivalence of measures. For this purpose an auxiliary result on orthogonal bases in $\mathcal{W}_T(F_1)$ - for a special choice of $F_1$ - is required.

Lemma. Let $f_1(\lambda) = |\varphi_0(\lambda)|^2$ where $\varphi_0 \in \mathcal{W}_s$ and $g_k(\lambda)$ is an arbitrary orthonormal basis in $\mathcal{W}_T(F_1)$. Then

$$\sum_{k=1}^{\infty} |g_k(\lambda)|^2 \le \frac{T + s}{\pi f_1(\lambda)}. \tag{8}$$



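Proof. Since $\mathcal{W}_T$ is everywhere dense in $\mathcal{W}_T(F_1)$ it is sufficient to verify inequality (8) for the case when $g_k(\lambda) \in \mathcal{W}_T$. Under this assumption $g_k(\lambda) \varphi_0(\lambda) \in \mathcal{W}_{T+s}$, hence

$$g_k(\lambda) \varphi_0(\lambda) = \int_{-T-s}^{T+s} e^{-i\lambda t}\, \psi_k(t)\, dt,$$

where $\psi_k(t) \in \mathcal{L}_2[-T-s, T+s]$. Since

$$\int_{-\infty}^{\infty} g_k(\lambda) \varphi_0(\lambda)\, \overline{g_j(\lambda) \varphi_0(\lambda)}\, d\lambda = \int_{-\infty}^{\infty} g_k(\lambda)\, \overline{g_j(\lambda)}\, f_1(\lambda)\, d\lambda = \delta_{kj},$$

it follows from Parseval's theorem that

$$\int_{-T-s}^{T+s} \psi_k(t)\, \overline{\psi_j(t)}\, dt = \frac{1}{2\pi}\, \delta_{kj}.$$

Hence $\sqrt{2\pi}\, \psi_k(t)$ form an orthonormal system of functions in $\mathcal{L}_2[-T-s, T+s]$. Therefore it follows from Bessel's inequality that

$$2\pi \sum_{k=1}^{\infty} \left| \int_{-T-s}^{T+s} e^{-i\lambda t}\, \psi_k(t)\, dt \right|^2 \le \int_{-T-s}^{T+s} |e^{-i\lambda t}|^2\, dt = 2T + 2s,$$

or

$$2\pi \sum_{k=1}^{\infty} |g_k(\lambda) \varphi_0(\lambda)|^2 \le 2(T + s).$$

The lemma is thus proved. □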

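Theorem 3. Let $\mu_1$ and $\mu_2$ be measures associated with stationary Gaussian processes $\xi_j(t)$, possessing spectral densities $f_j(\lambda)$ ($j = 1, 2$). If a function $\varphi_0(\lambda) \in \mathcal{W}_s$ exists and constants $c_1$ and $c_2$ exist such that the inequality

$$c_1 |\varphi_0(\lambda)|^2 \le f_1(\lambda) \le c_2 |\varphi_0(\lambda)|^2 \tag{9}$$

is satisfied and if moreover

$$\int \left[ \frac{f_2(\lambda) - f_1(\lambda)}{f_1(\lambda)} \right]^2 d\lambda < \infty,$$

then the measures $\mu_1$ and $\mu_2$ are equivalent for any $T$.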



Proof Set 



72(2) = 



fi{X) = c^\(Po(X)\^, 

[fiW; 

f/i(A)+/2(^)-/i(^); /2(^)>/i(^), 

0 ; / 2 (A)^/i(A). 



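Denote by $\tilde\mu_j$, $j = 1, 2, 3, 4$, the measures which are associated with Gaussian stationary processes on $[-T, T]$ possessing the spectral densities $\tilde f_j(\lambda)$ respectively. Since $\mu_j = \tilde\mu_{j+1} * \tilde\mu_4$, $j = 1, 2$ (the spectral density of a sum of independent processes is equal to the sum of the spectral densities of the summands), to prove the equivalence of measures $\mu_1$ and $\mu_2$ it is sufficient to show that $\tilde\mu_2 \sim \tilde\mu_3$, or that $\tilde\mu_j \sim \tilde\mu_1$, $j = 2, 3$. The proof of the last assertion is the same for both cases $j = 2$ and $j = 3$. Denote by $\tilde F_j(\lambda)$ the spectral function having the spectral density $\tilde f_j(\lambda)$. Let $\{g_k(\lambda)\}$ be an arbitrary orthonormal basis in $\mathcal{W}_T(\tilde F_1)$. Setting $h(\lambda) = \dfrac{\tilde f_j(\lambda) - \tilde f_1(\lambda)}{\tilde f_1(\lambda)}$ we obtain, in view of the lemma, that

$$\begin{aligned}
\sum_{k=1}^{\infty} \left[ \int |g_k(\lambda)|^2\, d\tilde F_j(\lambda) - \int |g_k(\lambda)|^2\, d\tilde F_1(\lambda) \right]^2
&= \sum_{k=1}^{\infty} \left[ \int |g_k(\lambda)|^2\, h(\lambda)\, \tilde f_1(\lambda)\, d\lambda \right]^2 \\
&\le \sum_{k=1}^{\infty} \int |g_k(\lambda)|^2\, \tilde f_1(\lambda)\, d\lambda \int |g_k(\lambda)|^2\, h^2(\lambda)\, \tilde f_1(\lambda)\, d\lambda \\
&= \int \sum_{k=1}^{\infty} |g_k(\lambda)|^2\, h^2(\lambda)\, \tilde f_1(\lambda)\, d\lambda \le \frac{T + s}{\pi} \int h^2(\lambda)\, d\lambda \\
&\le \frac{T + s}{\pi} \left( \frac{c_2}{c_1} \right)^2 \int \left[ \frac{f_2(\lambda) - f_1(\lambda)}{f_1(\lambda)} \right]^2 d\lambda
\end{aligned}$$

$\left(\text{since } |h(\lambda)| \le \dfrac{c_2}{c_1} \left| \dfrac{f_2(\lambda) - f_1(\lambda)}{f_1(\lambda)} \right| \right)$. Let $V$ be a symmetric operator in $\mathcal{W}_T(\tilde F_1)$ such that

$$(V g, g) = \int |g(\lambda)|^2\, d\tilde F_j(\lambda).$$

We have shown that for any orthonormal basis $\{g_k(\lambda)\}$ in $\mathcal{W}_T(\tilde F_1)$ the relation

$$\sum_{k=1}^{\infty} ([V - I] g_k, g_k)^2 \le c$$

is satisfied. This relation yields that $V - I$ is a Hilbert-Schmidt operator.

Let $g_k$ be a sequence of eigenfunctions of operator $V - I$ and let $\alpha_k$ be the corresponding eigenvalues. Then the function

$$b(\alpha, \beta) = \sum_{k=1}^{\infty} \alpha_k\, g_k(\alpha)\, \overline{g_k(\beta)}$$

is defined and belongs to $\mathcal{W}_T^2(\tilde F_1)$ in view of the fact that $\sum_{k=1}^{\infty} \alpha_k^2 < \infty$.

Denote by $\tilde R_k$ the correlation function of the process with the spectral density $\tilde f_k$ and denote by $\psi_t(\lambda)$ the function $e^{i\lambda t}$, $|t| \le T$ (this function belongs to $\mathcal{W}_T(\tilde F_1)$). Then

$$\begin{aligned}
\tilde R_j(t - s) - \tilde R_1(t - s) &= \int e^{i\lambda(t - s)}\, (d\tilde F_j(\lambda) - d\tilde F_1(\lambda)) = ([V - I] \psi_t, \psi_s) \\
&= \sum_{k=1}^{\infty} ([V - I] \psi_t, g_k)(g_k, \psi_s) = \sum_{k=1}^{\infty} \alpha_k (\psi_t, g_k)(g_k, \psi_s) \\
&= \sum_{k=1}^{\infty} \alpha_k \int e^{i\alpha t}\, \overline{g_k(\alpha)}\, d\tilde F_1(\alpha) \int e^{-i\beta s}\, g_k(\beta)\, d\tilde F_1(\beta) \\
&= \int\!\!\int e^{i\alpha t - i\beta s}\, \overline{b(\alpha, \beta)}\, d\tilde F_1(\alpha)\, d\tilde F_1(\beta).
\end{aligned}$$

Since the left-hand side is real, taking complex conjugates yields a representation of the form (4). To complete the proof one need merely utilize Theorem 2. □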

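Remark. Inequality (9) may be violated on a set $\Delta$ of finite measure such that

$$\int_{\Delta} \left[ \frac{f_k(\lambda)}{|\varphi_0(\lambda)|^2} \right]^2 d\lambda < \infty, \quad k = 1, 2.$$

Indeed in this case a measure $\mu_1^*$ can be introduced which corresponds to a stationary Gaussian process with the spectral density $f_1^*(\lambda)$ which already satisfies inequality (9). This spectral density is defined by

$$f_1^*(\lambda) = \begin{cases} f_1(\lambda), & \lambda \notin \Delta, \\ |\varphi_0(\lambda)|^2, & \lambda \in \Delta. \end{cases}$$

In this case

$$\int \left[ \frac{f_1(\lambda) - f_1^*(\lambda)}{f_1^*(\lambda)} \right]^2 d\lambda < \infty, \qquad \int \left[ \frac{f_2(\lambda) - f_1^*(\lambda)}{f_1^*(\lambda)} \right]^2 d\lambda < \infty,$$

so that $\mu_1 \sim \mu_1^*$, $\mu_2 \sim \mu_1^*$.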

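We now derive sufficient conditions for orthogonality of measures $\mu_1$ and $\mu_2$ under the assumption that $f_2(\lambda) \ge f_1(\lambda)$. First consider the case where $f_1(\lambda) = \dfrac{1}{1+\lambda^2}$. Let the measures $\mu_1$ and $\mu_2$ be equivalent. In view of Theorem 2, a function $b(\alpha, \beta)$ exists such that

$$R_2(t - s) - R_1(t - s) = \int\!\!\int e^{-i\alpha t + i\beta s}\, b(\alpha, \beta)\, \frac{d\alpha}{1+\alpha^2}\, \frac{d\beta}{1+\beta^2}.$$

Since

$$\int\!\!\int |b(\alpha, \beta)|^2\, \frac{d\alpha}{1+\alpha^2}\, \frac{d\beta}{1+\beta^2} < \infty,$$

the function $b(\alpha, \beta)\, \dfrac{\alpha}{1+\alpha^2}\, \dfrac{\beta}{1+\beta^2}$ is square integrable, so the derivative

$$\frac{\partial^2}{\partial t\, \partial s} [R_2(t - s) - R_1(t - s)]$$

exists and moreover,

$$\frac{\partial^2}{\partial t\, \partial s} [R_2(t - s) - R_1(t - s)] = \int\!\!\int e^{-i\alpha t + i\beta s}\, \frac{\alpha}{1+\alpha^2}\, \frac{\beta}{1+\beta^2}\, b(\alpha, \beta)\, d\alpha\, d\beta.$$

Setting $R(t) = R_2(t) - R_1(t)$ we obtain that

$$\int_{-T}^{T}\int_{-T}^{T} [R''(t - s)]^2\, dt\, ds = \int_{-2T}^{2T} [R''(t)]^2\, (2T - |t|)\, dt < \infty.$$

Utilizing the relation

$$\int_{-2T}^{2T} [R''(t)]^2\, (2T - |t|)\, dt = 4 \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \frac{\sin^2 T(\alpha - \beta)}{(\alpha - \beta)^2}\, \alpha^2 \beta^2\, [f_2(\alpha) - f_1(\alpha)]\,[f_2(\beta) - f_1(\beta)]\, d\alpha\, d\beta,$$

and also the equality $\dfrac{1}{f_1(\lambda)} = 1 + \lambda^2$, we find that in the case when $f_1(\lambda) = \dfrac{1}{1+\lambda^2}$ the equivalence of measures yields the inequality

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \frac{\sin^2 T(\alpha - \beta)}{(\alpha - \beta)^2}\, \frac{f_2(\alpha) - f_1(\alpha)}{f_1(\alpha)}\, \frac{f_2(\beta) - f_1(\beta)}{f_1(\beta)}\, d\alpha\, d\beta < \infty.$$

Hence if for some $T > 0$

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \frac{\sin^2 T(\alpha - \beta)}{(\alpha - \beta)^2}\, \frac{f_2(\alpha) - f_1(\alpha)}{f_1(\alpha)}\, \frac{f_2(\beta) - f_1(\beta)}{f_1(\beta)}\, d\alpha\, d\beta = +\infty,$$

then the measures $\mu_1$ and $\mu_2$ associated with stationary processes on $[-T, T]$ will be orthogonal (provided $f_2(\lambda) \ge f_1(\lambda)$).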

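Note that all the previous arguments remain valid if instead of the requirement $f_1(\lambda) = \dfrac{1}{1+\lambda^2}$ the inequality

$$\frac{c_1}{1+\lambda^2} \le f_1(\lambda) \le \frac{c_2}{1+\lambda^2}$$

is satisfied for some $c_1$ and $c_2$.

Now let $\varphi_0(\lambda)$ be an entire analytic function of exponential type at most $s$, such that

$$\int \frac{d\lambda}{(1+\lambda^2)\, |\varphi_0(\lambda)|^2} < \infty.$$

Let the spectral density $f_1(\lambda)$ satisfy the inequality

$$\frac{c_1}{(1+\lambda^2)\, |\varphi_0(\lambda)|^2} \le f_1(\lambda) \le \frac{c_2}{(1+\lambda^2)\, |\varphi_0(\lambda)|^2}$$

for some $c_1$ and $c_2$.

Consider the processes

$$\tilde\xi_j(t) = \int_{-\infty}^{\infty} e^{i\lambda t}\, \varphi_0(\lambda)\, dy_j(\lambda),$$

where $y_j(\lambda)$ is defined in representation (1) (p. 499). It is easy to verify that $\varphi_0 \in \mathcal{W}_s(F_j)$. Therefore taking a sequence $\int_{-s}^{s} h_n(u)\, e^{i\lambda u}\, du$, convergent to $\varphi_0(\lambda)$ in $\mathcal{W}_s(F_j)$, we obtain

$$\tilde\xi_j(t) = \lim_{n\to\infty} \int \int_{-s}^{s} e^{i\lambda(t + u)}\, h_n(u)\, du\, dy_j(\lambda) = \lim_{n\to\infty} \int_{-s}^{s} \xi_j(t + u)\, h_n(u)\, du.$$

Thus the values of the process $\xi_j(t)$ on $[-T-s, T+s]$ determine the process $\tilde\xi_j(t)$ on $[-T, T]$. The spectral density of the process $\tilde\xi_j(t)$ is equal to $\tilde f_j(\lambda) = f_j(\lambda)\, |\varphi_0(\lambda)|^2$, so that

$$\frac{c_1}{1+\lambda^2} \le \tilde f_1(\lambda) \le \frac{c_2}{1+\lambda^2}.$$

Since $\dfrac{\tilde f_2 - \tilde f_1}{\tilde f_1} = \dfrac{f_2 - f_1}{f_1}$, it follows from the above that the measures $\tilde\mu_1$ and $\tilde\mu_2$ are orthogonal provided

$$\int\!\!\int \frac{\sin^2 T(\alpha - \beta)}{(\alpha - \beta)^2}\, \frac{f_2(\alpha) - f_1(\alpha)}{f_1(\alpha)}\, \frac{f_2(\beta) - f_1(\beta)}{f_1(\beta)}\, d\alpha\, d\beta = +\infty. \tag{10}$$

However in this case the measures $\mu_1$* and $\mu_2$* will also be orthogonal.

* We assume that $\mu_j$ are associated with processes $\xi_j(t)$ on the interval $[-T-s, T+s]$.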
Finally note that the function $(1 + i\lambda)\, \varphi_0(\lambda)$ is also an entire function of exponential type at most $s$. We have thus proved the following

Theorem 4. If $\mu_1$ and $\mu_2$ are measures associated with stationary Gaussian processes on $[-T, T]$ with spectral densities $f_1(\lambda)$ and $f_2(\lambda)$ and mean values 0, and an entire analytic function $\varphi_0(\lambda)$ of exponential type at most $s < T$ exists such that for some $c_1 > 0$ and $c_2 > 0$ the inequality $c_1 \le f_1(\lambda)\, |\varphi_0(\lambda)|^2 \le c_2$ is satisfied, then the relation

$$\int\!\!\int \frac{\sin^2 (T - s)(\alpha - \beta)}{(\alpha - \beta)^2}\, \frac{f_2(\alpha) - f_1(\alpha)}{f_1(\alpha)}\, \frac{f_2(\beta) - f_1(\beta)}{f_1(\beta)}\, d\alpha\, d\beta = \infty$$

implies orthogonality of measures $\mu_1$ and $\mu_2$.

Remark. The function 

where a > 0 is arbitrary and m> a + 1, satisfies 

o<ini(\MV rnip)<™p(i'»oWf 

The function (po{^) is entire and of the exponential type not exceeding 
ms. Functions of this kind may be utilized for verifying the conditions 
of Theorem 3 and 4. 

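Corollary. If $f_1(\lambda)$ and $f_2(\lambda)$ are bi-linear (rational) functions then the condition

$$\lim_{\lambda\to\infty} \frac{f_2(\lambda)}{f_1(\lambda)} = 1$$

is necessary and sufficient for the equivalence of measures $\mu_1$ and $\mu_2$.

Proof. If $f_1(\lambda) > 0$ then the corresponding condition on $f_1(\lambda)$ stated in Theorem 3 as well as in Theorem 4 is satisfied with functions $\varphi_0(\lambda)$ of the form stated in the previous remark. If

$$\lim_{\lambda\to\infty} \frac{f_2(\lambda)}{f_1(\lambda)} = 1$$

then

$$\int \left[ \frac{f_2(\lambda) - f_1(\lambda)}{f_1(\lambda)} \right]^2 d\lambda < \infty$$

and one can utilize Theorem 3. If this condition is violated, one can apply Theorem 4. If $f_1(\lambda)$ vanishes one may use $f_1^*(\lambda)$ (in place of $f_1(\lambda)$) such that $f_1^*(\lambda) > 0$ and $\lim_{\lambda\to\infty} f_1(\lambda)/f_1^*(\lambda) = 1$.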
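The criterion of the corollary reduces to comparing leading terms of the two rational densities. The following is a minimal sketch under stated assumptions: each density is given as a ratio of polynomials in $\lambda$ with coefficient lists ordered from the highest degree down, leading coefficients are nonzero, and $f_1(\lambda) > 0$ as in the proof. The helper names are hypothetical.

```python
import math

def limit_ratio_at_infinity(p1, q1, p2, q2):
    """Limit of f2/f1 as lambda -> infinity, where f_j = p_j / q_j.

    Polynomials are coefficient lists, highest degree first, with nonzero
    leading coefficients (an assumption of this sketch).
    """
    num_deg = (len(p2) - 1) + (len(q1) - 1)   # degree of p2 * q1
    den_deg = (len(q2) - 1) + (len(p1) - 1)   # degree of q2 * p1
    lead = (p2[0] * q1[0]) / (q2[0] * p1[0])
    if num_deg == den_deg:
        return lead
    if num_deg < den_deg:
        return 0.0
    return math.copysign(math.inf, lead)

def equivalent_by_corollary(p1, q1, p2, q2):
    """True when lim f2/f1 = 1, the condition of the corollary."""
    return math.isclose(limit_ratio_at_infinity(p1, q1, p2, q2), 1.0)

# f1 = 1/(lambda^2 + 1), f2 = (lambda^2 + 2)/(lambda^4 + 5 lambda^2 + 4), f3 = 2/(lambda^2 + 1)
f1 = ([1.0], [1.0, 0.0, 1.0])
f2 = ([1.0, 0.0, 2.0], [1.0, 0.0, 5.0, 0.0, 4.0])
f3 = ([2.0], [1.0, 0.0, 1.0])
print(equivalent_by_corollary(*f1, *f2))  # limit of the ratio is 1
print(equivalent_by_corollary(*f1, *f3))  # limit of the ratio is 2
```

By the equivalence-orthogonality dichotomy for Gaussian measures, a pair failing this test corresponds to orthogonal measures.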
§6. General Properties of Densities of
Measures Associated with Markov Processes

Let two random processes $\xi_1(t)$ and $\xi_2(t)$ with values in a certain space $\mathcal{X}$ with a $\sigma$-algebra of measurable sets $\mathfrak{B}$ be defined on a certain numerical set $T$. Denote by $\mu_i^t$ the measures associated with the random processes $\xi_i(t)$ defined on the $\sigma$-algebra $\mathfrak{F}_t$ generated by the cylinders over $(-\infty, t] \cap T$. The measures $\mu_i^{\infty}$ will be denoted by $\mu_i$. Assume that $\mu_2 \ll \mu_1$ and $\rho = \dfrac{d\mu_2}{d\mu_1}$. Then we also have $\mu_2^t \ll \mu_1^t$ for all $t$ and moreover

$$\rho_t = \frac{d\mu_2^t}{d\mu_1^t} = \mathsf{E}(\rho \mid \mathfrak{F}_t),$$

where the conditional mathematical expectation is taken with respect to the probability space $\{\mathcal{F}_T(\mathcal{X}), \mathfrak{F}, \mu_1\}$, where $\mathcal{F}_T(\mathcal{X})$ is the space of all functions on $T$ with values in $\mathcal{X}$. It is easily seen that the process $\rho_t$ is a martingale satisfying $\mathsf{E}\rho_t = 1$. On the other hand, any nonnegative martingale $\rho_t$ satisfying $\mathsf{E}\rho_t = 1$ can serve as the density of $\mu_2^t$ with respect to $\mu_1^t$ for some pair of processes $\xi_1(\cdot)$ and $\xi_2(\cdot)$. This result, which is valid for any process, is of little interest. More interesting results are obtained if it is assumed that $\xi_1(t)$ and $\xi_2(t)$ are processes belonging to a certain more restricted class of processes. In this section the case when both processes are Markovian is discussed.

Let $\xi_1(t)$ and $\xi_2(t)$ be Markov processes defined on the interval $[a, b]$ and taking on values in a separable metric space $(\mathcal{X}, \mathfrak{A})$ ($\mathfrak{A}$ is the $\sigma$-algebra of Borel sets). Denote by $\mathcal{F}_{[\alpha,\beta]}$ the space of all functions with values in $\mathcal{X}$ defined on $[\alpha, \beta]$ and by $\mathfrak{F}_{[\alpha,\beta]}$ the $\sigma$-algebra of the subsets of $\mathcal{F}_{[\alpha,\beta]}$ generated by the cylindrical sets. Let $\mu^i_{x,[\alpha,\beta]}$ be the measure on $\mathfrak{F}_{[\alpha,\beta]}$ constructed from the transition probabilities of the process $\xi_i(t)$ given $\xi_i(\alpha) = x$. Since the process is Markovian it follows that for cylindrical sets $A$ of the form $A_1 \cap A_2 \cap A_3$, where $A_1$ and $A_3$ are cylinders in $\mathfrak{F}_{[a,c]}$ and $\mathfrak{F}_{[c,b]}$ respectively and $A_2$ is a cylinder in $\mathfrak{F}_{[c,c]}$ (i.e. a set of the form $\{x(\cdot): x(c) \in E\}$), the following formula

$$\mu^i_{[a,b]}(A) = \int_{A_2} \mu^i_{[a,c]}(A_1 \cap dy)\, \mu^i_{y,[c,b]}(A_3) \tag{1}$$

is valid. Here $\mu^i_{[a,c]}$ is the measure corresponding to the Markov process $\xi_i(t)$ on $[a, c]$ (it is a measure on $\mathfrak{F}_{[a,c]}$), and $\mu^i_{[a,c]}(A_1 \cap dy)$ is the measure defined by the set function $A_2 \mapsto \mu^i_{[a,c]}(A_1 \cap A_2)$ (it is a measure with respect to $A_2$ on $\mathfrak{F}_{[c,c]}$). Note that $A_1 \cap A_2 \cap A_3$ represents a cylindrical set in $\mathfrak{F}_{[a,b]}$ containing all the functions $x(\cdot)$ whose restrictions to $[a, c]$, $[c, c]$ and $[c, b]$ belong to $A_1$, $A_2$ and $A_3$ respectively. The set $A_1 \cap A_2$ is defined analogously. We now establish an auxiliary result. Recall that the $\sigma$-algebra $\mathfrak{C}$ is called separable if a sequence of sets $A_1, A_2, \ldots$ exists such that $\mathfrak{C}$ coincides with the minimal $\sigma$-algebra containing all the sets $A_k$.

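Lemma. Let $(\mathcal{X}, \mathfrak{B})$ and $(\mathcal{Y}, \mathfrak{C})$ be two measurable spaces, let $\mu_1$ and $\mu_2$ be two probability measures on $\mathfrak{B}$ and let the probability measures $\nu_1(x, C)$ and $\nu_2(x, C)$ be defined on $\mathfrak{C}$ for each $x \in \mathcal{X}$ such that $\nu_k(x, C)$ is $\mathfrak{B}$-measurable for all $C \in \mathfrak{C}$. Define on $\mathfrak{B} \times \mathfrak{C}$ the measures $\pi_k$ by the equality

$$\pi_k(B \times C) = \int_B \nu_k(x, C)\, \mu_k(dx), \quad B \in \mathfrak{B},\ C \in \mathfrak{C}.$$

If $\pi_2 \ll \pi_1$ and if a separable $\sigma$-algebra $\mathfrak{C}_0$ exists whose completion with respect to measure $\nu_1(x, \cdot)$ contains $\mathfrak{C}$ for $\mu_2$-almost all $x$, then $\mu_2 \ll \mu_1$ and $\nu_2(x, \cdot) \ll \nu_1(x, \cdot)$ for $\mu_2$-almost all $x$.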

Proof. Set

$$\rho(x, y) = \frac{d\pi_2}{d\pi_1}(x, y).$$

Then for all $B \in \mathfrak{B}$ and $C \in \mathfrak{C}$ we have

$$\pi_2(B \times C) = \int_B \int_C \rho(x, y)\, \nu_1(x, dy)\, \mu_1(dx) = \int_B \frac{\int_C \rho(x, y)\, \nu_1(x, dy)}{\int_{\mathcal{Y}} \rho(x, y')\, \nu_1(x, dy')} \int_{\mathcal{Y}} \rho(x, y')\, \nu_1(x, dy')\, \mu_1(dx). \tag{2}$$

Taking $C = \mathcal{Y}$, we obtain that

$$\frac{d\mu_2}{d\mu_1}(x) = \int_{\mathcal{Y}} \rho(x, y)\, \nu_1(x, dy). \tag{3}$$

Utilizing (2) and (3) we have

$$\pi_2(B \times C) = \int_B \nu_2(x, C)\, \mu_2(dx) = \int_B \int_C \tilde\rho(x, y)\, \nu_1(x, dy)\, \mu_2(dx),$$

where

$$\tilde\rho(x, y) = \rho(x, y) \Big/ \int_{\mathcal{Y}} \rho(x, y')\, \nu_1(x, dy'). \tag{4}$$

Hence, for $\mu_2$-almost all $x$ the relation

$$\nu_2(x, C) = \int_C \tilde\rho(x, y)\, \nu_1(x, dy) \tag{5}$$

is satisfied for each $C$.

Let $C_k$ be a sequence of sets generating $\mathfrak{C}_0$. Then one can find a set $B^* \in \mathfrak{B}$ such that $\mu_2(B^*) = 1$ and such that for all $x \in B^*$ and all $C_k$ the relation

$$\nu_2(x, C_k) = \int_{C_k} \tilde\rho(x, y)\, \nu_1(x, dy)$$

is valid. In this case, however, relation (5) is satisfied for all $C \in \mathfrak{C}_0$ also, and hence this equality is also satisfied for all $C$ belonging to the completion of $\mathfrak{C}_0$ with respect to measure $\nu_1(x, \cdot)$. The lemma is proved. □
Note that for stochastically continuous processes with values in a separable space one can always find a separable $\sigma$-algebra $\mathfrak{F}^0$ (this is the $\sigma$-algebra generated by cylindrical sets over a countable set of values of the argument of the process, which is everywhere dense in its domain of definition) such that the completion of $\mathfrak{F}^0$ with respect to the measure associated with the process contains $\mathfrak{F}$. Applying the lemma just proved to measures represented by equality (1) we observe that $\mu^2_{x,[c,b]} \ll \mu^1_{x,[c,b]}$ for almost all $x$ with respect to the measure which is the distribution of $\xi_2(c)$. Let $\rho_{[a,c]}$ denote the density of the measure $\mu^2_{[a,c]}$ with respect to $\mu^1_{[a,c]}$ if the argument of this density is substituted by $\xi_1(t)$ (here we assume that all the processes $\xi_i(t)$ are defined on a certain fixed probability space $\{\Omega, \mathfrak{S}, \mathsf{P}\}$). Analogously denote by $\rho_{y,[c,b]}$ the density of the measure $\mu^2_{y,[c,b]}$ with respect to $\mu^1_{y,[c,b]}$ with the same argument. Then it follows from formula (1), the lemma and formula (4) that

$$\rho_{[a,b]} = \rho_{[a,c]}\, \rho_{\xi_1(c),[c,b]}. \tag{6}$$

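Denote by $\mathfrak{S}_{[\alpha,\beta]}$ the $\sigma$-subalgebra in the probability space $\{\Omega, \mathfrak{S}, \mathsf{P}\}$ generated by the variables $\xi_1(t)$ for $t \in [\alpha, \beta]$. The functions $\rho_{[a,c]}$ and $\rho_{\xi_1(c),[c,b]}$ are measurable relative to $\mathfrak{S}_{[a,c]}$ and $\mathfrak{S}_{[c,b]}$ respectively. Consider the subdivision of the interval $[a, t]$: $a = t_0 < t_1 < \ldots < t_k = t$. It follows from formula (6) that

$$\rho_{[a,t]} = \rho_{[a,t_1]} \prod_{j=1}^{k-1} \rho_{\xi_1(t_j),[t_j,t_{j+1}]}. \tag{7}$$

Utilizing (6) we now prove the following theorem: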

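Theorem 1. If $\xi_1(t)$ and $\xi_2(t)$ are stochastically continuous Markov processes, then the composite process $\{\xi_1(t);\ \rho_{[a,t]}\}$ is also Markovian.

Proof. It is sufficient to show that for any continuous bounded function $f(x, s)$ of two variables $x \in \mathcal{X}$ and $s \in \mathcal{R}^1$ the relation

$$\mathsf{E}(f(\xi_1(t), \rho_{[a,t]}) \mid \mathfrak{S}_{[a,t_1]}) = \mathsf{E}(f(\xi_1(t), \rho_{[a,t]}) \mid \xi_1(t_1), \rho_{[a,t_1]}) \tag{8}$$

is satisfied for $a < t_1 < t$.

Since $\rho_{[a,t]} = \rho_{[a,t_1]}\, \rho_{\xi_1(t_1),[t_1,t]}$, we have

$$f(\xi_1(t), \rho_{[a,t]}) = \varphi(\xi_1(t), \rho_{[a,t_1]}, \rho_{\xi_1(t_1),[t_1,t]}),$$

where $\varphi(x, s_1, s_2)$ is a continuous bounded function on $\mathcal{X} \times \mathcal{R}^1 \times \mathcal{R}^1$. Assume first that $\varphi(x, s_1, s_2) = \varphi(x, s_2)\, \psi(s_1)$. Then utilizing the measurability of $\varphi(\xi_1(t), \rho_{\xi_1(t_1),[t_1,t]})$ relative to $\mathfrak{S}_{[t_1,t]}$ and the Markovian property of $\xi_1(t)$, we have

$$\begin{aligned}
\mathsf{E}(\varphi(\xi_1(t), \rho_{\xi_1(t_1),[t_1,t]})\, \psi(\rho_{[a,t_1]}) \mid \mathfrak{S}_{[a,t_1]})
&= \psi(\rho_{[a,t_1]})\, \mathsf{E}(\varphi(\xi_1(t), \rho_{\xi_1(t_1),[t_1,t]}) \mid \mathfrak{S}_{[a,t_1]}) \\
&= \psi(\rho_{[a,t_1]})\, \mathsf{E}(\varphi(\xi_1(t), \rho_{\xi_1(t_1),[t_1,t]}) \mid \xi_1(t_1)) \\
&= \mathsf{E}(\psi(\rho_{[a,t_1]})\, \varphi(\xi_1(t), \rho_{\xi_1(t_1),[t_1,t]}) \mid \xi_1(t_1), \rho_{[a,t_1]}).
\end{aligned}$$

Noting that both sides of formula (8) are linear in $f$ and that linear combinations of the form

$$\sum_k c_k\, \varphi_k(x, s_2)\, \psi_k(s_1)$$

can approximate any continuous function $\varphi(x, s_1, s_2)$, we have thus verified formula (8) and proved the theorem. □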

Remark. Assume that for all te[a, 6] and all subdivisions of the interval 
[a, t\ equation (7) is valid where Q[^,t] and + certain vari- 

ables measurable relative to and + Then ^ 2(0 is a Markov 
process provided (t) is such and, moreover, the transition probabilities 
of process ^2 (0 are determined by the equality 

X, t2 ^)—^iXA{^l{h)) 1 ‘^l(^l))^i(ri) = JC- 

Indeed, for any collection of sets yli, in 91 we have 
^XAAUh))^^-XAdUtk))= 
k

j=2 
k- 1 

= EZA,('^l(tl))e[a.,d n 

j=2 
k— I 

7 = 2 



The required assertion follows from this relation. □ 

Consider now the construction of the function $\rho_{[a,b]}$ by means of the transition probabilities of the processes $\xi_1(t)$ and $\xi_2(t)$. At the same time we shall obtain certain sufficient conditions for absolute continuity of the measures associated with these processes. It follows from the lemma (p. 516) that the absolute continuity of $\mu_2$ with respect to $\mu_1$ implies the absolute continuity of the transition probability $P^{(2)}(t, x, s, A)$ of the process $\xi_2(t)$ (as a function of $A$) with respect to the transition probability $P^{(1)}(t, x, s, A)$ of the process $\xi_1(t)$ for almost all $x$ (with respect to the measure which is the distribution of $\xi_2(t)$). Let

$$\rho(t, x, s, y) = \frac{dP^{(2)}(t, x, s, \cdot)}{dP^{(1)}(t, x, s, \cdot)}(y). \tag{9}$$

(In the case when $P^{(2)}$ is not absolutely continuous with respect to $P^{(1)}$ for a given $x$, $\rho$ denotes the derivative of the absolutely continuous component of $P^{(2)}$ with respect to $P^{(1)}$.) Next let $\rho_a(y)$ denote the density of the distribution of the variable $\xi_2(a)$ with respect to the distribution of the variable $\xi_1(a)$. If $a = t_0 < t_1 < \ldots < t_n = b$ is a subdivision of $[a, b]$ then

$$\rho_a(\xi_1(a)) \prod_{k=0}^{n-1} \rho(t_k, \xi_1(t_k), t_{k+1}, \xi_1(t_{k+1}))$$

coincides with the density of the measure $\mu_2^{\Lambda_n}$ with respect to $\mu_1^{\Lambda_n}$, where $\mu_i^{\Lambda_n}$ is the contraction of the measure $\mu_i$ on the $\sigma$-algebra $\mathfrak{F}^{\Lambda_n}$ and $\Lambda_n$ is a finite set of values of the argument $t$: $\Lambda_n = \{t_0, \ldots, t_n\}$. If $\Lambda_n \subset \Lambda_{n+1}$ then the limit

$$\lim_{n\to\infty} \rho^{\Lambda_n} = \lim_{n\to\infty} \frac{d\mu_2^{\Lambda_n}}{d\mu_1^{\Lambda_n}}$$

exists (with probability 1). If $\Lambda = \bigcup_n \Lambda_n$ is everywhere dense in $[a, b]$, and the process $\xi_1(t)$ is stochastically continuous, then the completion of $\mathfrak{F}^{\Lambda}$ with respect to $\mu_1$ contains $\mathfrak{F}_{[a,b]}$; hence $\lim_{n\to\infty} \rho^{\Lambda_n}$ coincides with $\rho_{[a,b]}$.

Theorem 2. Let $P^{(i)}(t, x, s, A)$, $i = 1, 2$, be the transition probabilities of two 
Markov processes $\xi_1(t)$ and $\xi_2(t)$ defined on $[a, b]$. If the following 
conditions are satisfied: 

a) the distribution of $\xi_2(a)$ is absolutely continuous with respect to the 
distribution of $\xi_1(a)$ with density $\rho_a(x)$; 

b) for all $x\in\mathcal{X}$, $a\le t<s\le b$, the measure $P^{(2)}(t, x, s, \cdot)$ is absolutely 
continuous with respect to measure $P^{(1)}(t, x, s, \cdot)$ with density $\rho(t, x, s, y)$; 

c) there exists a constant $c$ such that 

$$\int \log\rho(t, x, s, y)\, P^{(2)}(t, x, s, dy) \le c(s-t),$$

then the measure $\mu_2$ is absolutely continuous with respect to $\mu_1$ and 

$$\frac{d\mu_2}{d\mu_1}(\xi_1(\cdot)) = \rho_a(\xi_1(a)) \lim_{n\to\infty} \prod_{k=0}^{n-1} \rho(t_{nk}, \xi_1(t_{nk}), t_{n,k+1}, \xi_1(t_{n,k+1})), \tag{10}$$

where $a = t_{n0}< \dots <t_{nn} = b$ and the sets $\Lambda_n = \{t_{nk},\ k = 0, \dots, n\}$ satisfy the 
conditions: $\Lambda_n\subset\Lambda_{n+1}$ and $\bigcup_n \Lambda_n$ is everywhere dense in $[a, b]$. 

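Proof. Introduce the process $\xi_3(t)$ with transition probabilities equal 
to the transition probabilities of the process $\xi_2(t)$, and let the distribution 
of $\xi_3(a)$ coincide with the distribution of $\xi_1(a)$. Let $\mu_3$ be the measure 
which is associated with the process $\xi_3(t)$ and $\mu_i^{(n)}$ be, as above, the 
restriction of measure $\mu_i$ to the $\sigma$-algebra $\mathfrak{F}_{\Lambda_n}$, where $\Lambda_n = \{t_{nk},\ k = 0, \dots, n\}$. 
Then, as it is easy to verify, 

$$\frac{d\mu_3^{(n)}}{d\mu_1^{(n)}}(\xi_1(\cdot)) = \prod_{k=0}^{n-1} \rho(t_{nk}, \xi_1(t_{nk}), t_{n,k+1}, \xi_1(t_{n,k+1})).$$

Clearly, 

$$\lim_{n\to\infty} \frac{d\mu_2^{(n)}}{d\mu_3^{(n)}}(\xi_3(\cdot)) = \rho_a(\xi_3(a)).$$

Also the following limit 

$$\lim_{n\to\infty} \prod_{k=0}^{n-1} \rho(t_{nk}, \xi_1(t_{nk}), t_{n,k+1}, \xi_1(t_{n,k+1}))$$

exists with probability 1. Denote this limit by $\rho'$. In order that the relation 
$\mu_3\ll\mu_1$ be valid, it is sufficient that $E\rho' = 1$. In turn, in view of the remark 
following Theorem 2 in Section 1 it is sufficient for the validity of $E\rho' = 1$ 
that the following expression 

$$E \prod_{k=0}^{n-1} \rho(t_{nk}, \xi_1(t_{nk}), t_{n,k+1}, \xi_1(t_{n,k+1})) \times \log \prod_{k=0}^{n-1} \rho(t_{nk}, \xi_1(t_{nk}), t_{n,k+1}, \xi_1(t_{n,k+1}))$$

be bounded. Utilizing the equality 

$$E\bigl(\rho(t_{nk}, \xi_1(t_{nk}), t_{n,k+1}, \xi_1(t_{n,k+1})) \mid \xi_1(t_{nk})\bigr) = 1,$$

we obtain that 

$$\begin{aligned}
&E \prod_{k=0}^{n-1} \rho(t_{nk}, \xi_1(t_{nk}), t_{n,k+1}, \xi_1(t_{n,k+1})) \log \prod_{k=0}^{n-1} \rho(t_{nk}, \xi_1(t_{nk}), t_{n,k+1}, \xi_1(t_{n,k+1})) \\
&\quad= \sum_{l=0}^{n-1} E \prod_{k=0}^{l-1} \rho(t_{nk}, \xi_1(t_{nk}), t_{n,k+1}, \xi_1(t_{n,k+1})) \int \log\rho(t_{nl}, \xi_1(t_{nl}), t_{n,l+1}, y)\, \rho(t_{nl}, \xi_1(t_{nl}), t_{n,l+1}, y)\, P^{(1)}(t_{nl}, \xi_1(t_{nl}), t_{n,l+1}, dy) \\
&\quad\le c \sum_{l=0}^{n-1} (t_{n,l+1}-t_{nl}) = c(b-a).
\end{aligned}$$

Consequently, $\mu_3\ll\mu_1$, and since $\mu_2\ll\mu_3$ by condition a), it follows that $\mu_2\ll\mu_1$. 
Formula (10) is a corollary of the relation 

$$\frac{d\mu_2}{d\mu_1} = \frac{d\mu_2}{d\mu_3}\,\frac{d\mu_3}{d\mu_1}.$$

The theorem is proved. □ 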
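
The product structure of the density in formula (10) can be checked by hand in a discrete setting. The following is a minimal sketch, not taken from the text: it assumes a hypothetical two-state chain observed at finitely many times, with illustrative initial distributions `init1`, `init2` and transition matrices `P1`, `P2`. It verifies by enumeration that the product of the initial density and the one-step transition densities integrates to 1 under the first measure and reproduces the second measure on an event.

```python
from itertools import product

# Hypothetical two-state example: all numbers are illustrative assumptions.
init1 = [0.5, 0.5]             # distribution of xi_1(a)
init2 = [0.8, 0.2]             # distribution of xi_2(a)
P1 = [[0.7, 0.3], [0.4, 0.6]]  # one-step transition probabilities of xi_1
P2 = [[0.5, 0.5], [0.1, 0.9]]  # one-step transition probabilities of xi_2


def path_prob(path, init, P):
    """Probability of a finite path under the chain (init, P)."""
    p = init[path[0]]
    for x, y in zip(path, path[1:]):
        p *= P[x][y]
    return p


def density(path):
    """rho_a(x_0) * prod_k rho(x_k, x_{k+1}): discrete analogue of formula (10)."""
    r = init2[path[0]] / init1[path[0]]
    for x, y in zip(path, path[1:]):
        r *= P2[x][y] / P1[x][y]
    return r


n_steps = 4
paths = list(product(range(2), repeat=n_steps + 1))

# E_1[rho] should equal 1: rho is a probability density with respect to mu_1.
total = sum(path_prob(p, init1, P1) * density(p) for p in paths)

# For an event A, E_1[chi_A * rho] should equal mu_2(A).
A = [p for p in paths if p[-1] == 1]  # the event {xi(t_n) = 1}
lhs = sum(path_prob(p, init1, P1) * density(p) for p in A)
rhs = sum(path_prob(p, init2, P2) for p in A)
print(total, lhs, rhs)
```

In this finite setting no limit over subdivisions is needed; the limit in (10) and condition c) are what carry the same identity over to continuous time.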

Consider the problem of constructing a Markov process $\xi_2(t)$ for 
which the associated measure $\mu_2$ is absolutely continuous with respect to 
the measure $\mu_1$ which is associated with the given Markov process $\xi_1(t)$. 

Theorem 3. Let $a = t_{n0}< \dots <t_{nn} = b$ be a sequence of subdivisions of the 
interval $[a, b]$ such that the sets $\Lambda_n = \{t_{nk},\ k = 0, \dots, n\}$ form an increasing 
sequence and $\bigcup_n \Lambda_n$ is everywhere dense on $[a, b]$. Let a function $\alpha_n(t, x, s, y)$ 
be defined for each $n$, measurable in $x$ and $y$, where $x, y\in\mathcal{X}$, $a\le t<s\le b$, 
satisfying the following conditions: 

1) the limit 

$$\alpha = \lim_{n\to\infty} \sum_{k=0}^{n-1} \alpha_n(t_{nk}, \xi_1(t_{nk}), t_{n,k+1}, \xi_1(t_{n,k+1}))$$

exists in the sense of convergence in probability; 

2) 

$$\int \alpha_n(t, x, s, y)\, e^{\alpha_n(t, x, s, y)}\, P^{(1)}(t, x, s, dy) = O(s-t)$$

uniformly in $x$; 

3) 

$$\int e^{\alpha_n(t_{nk}, x, t_{n,k+1}, y)}\, P^{(1)}(t_{nk}, x, t_{n,k+1}, dy) = 1$$

for all $n$, $k$ and $x$. Then the measure $\mu_2$ defined on $\mathfrak{F}_{[a,b]}$ by the equality 

$$\mu_2(A) = E\,\chi_A(\xi_1(\cdot))\,e^{\alpha}$$

will be associated with a certain Markov process on $[a, b]$. 

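Proof. First we show that the measure $\mu_2$ is a probability measure, i.e. 
$Ee^{\alpha} = 1$. Let 

$$\alpha_n = \sum_{k=0}^{n-1} \alpha_n(t_{nk}, \xi_1(t_{nk}), t_{n,k+1}, \xi_1(t_{n,k+1})).$$

Then it follows from condition 3) of the theorem that 

$$Ee^{\alpha_n} = \int\!\cdots\!\int P\{\xi_1(a)\in dx_0\} \prod_{k=0}^{n-1} e^{\alpha_n(t_{nk}, x_k, t_{n,k+1}, x_{k+1})}\, P^{(1)}(t_{nk}, x_k, t_{n,k+1}, dx_{k+1}) = 1.$$

On the other hand, in view of condition 2) 

$$\begin{aligned}
Ee^{\alpha_n}\ln e^{\alpha_n} = \sum_{k=0}^{n-1} \int\!\cdots\!\int P\{\xi_1(a)\in dx_0\} &\prod_{j=0}^{k-1} e^{\alpha_n(t_{nj}, x_j, t_{n,j+1}, x_{j+1})}\, P^{(1)}(t_{nj}, x_j, t_{n,j+1}, dx_{j+1}) \\
&\times \alpha_n(t_{nk}, x_k, t_{n,k+1}, x_{k+1})\, e^{\alpha_n(t_{nk}, x_k, t_{n,k+1}, x_{k+1})}\, P^{(1)}(t_{nk}, x_k, t_{n,k+1}, dx_{k+1}).
\end{aligned}$$

Therefore $Ee^{\alpha_n}\ln e^{\alpha_n}$ is bounded and hence $e^{\alpha_n}$ is uniformly integrable 
with respect to $n$, so that one may pass to the limit under the sign 
of mathematical expectation in the relation $Ee^{\alpha_n} = 1$. To verify that the 
measure $\mu_2$ is associated with a Markov process consider the function 

$$\eta(t) = \lim_{n\to\infty} \sum_{t_{nk}<t} \alpha_n(t_{nk}, \xi_1(t_{nk}), t_{n,k+1}, \xi_1(t_{n,k+1}))$$

(here the limit is taken in the sense of convergence in probability). We 
show that this limit exists for all $t\in[a, b]$ and coincides with the expression 

$$\ln E(e^{\alpha} \mid \mathfrak{F}_{[a,t]}) = \lim_{n\to\infty} \ln E(e^{\alpha_n} \mid \mathfrak{F}_{[a,t]})$$

(here the limiting transition under the sign of the mathematical 
expectation is justified in view of the uniform integrability of the function 
$e^{\alpha_n}$). For $t\in\Lambda_n$ we have the equality 

$$\sum_{t_{nk}<t} \alpha_n(t_{nk}, \xi_1(t_{nk}), t_{n,k+1}, \xi_1(t_{n,k+1})) = \ln E(e^{\alpha_n} \mid \mathfrak{F}_{[a,t]}).$$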

Therefore the lim? 7 „ exists for te[JA„. If, however, 1 , then 

n 

lnE(c^” I ®[0,f])~ Y ^ni^nk^ ^l(^nk)’ ^l(^«k+l))~ 

tnk<t 

It follows from condition 2) and 3) that 






as $n\to\infty$ uniformly in $j$. Utilizing the fact that the variable 
$\exp\{\alpha_n(t_{nj}, \xi_1(t_{nj}), t_{n,j+1}, \xi_1(t_{n,j+1}))\}$ is uniformly integrable with respect to $j$ as well as 
the convergence of this variable in probability to 1, we obtain that 



!^\{tnj),tnj + \, + x)) 



in the sense of convergence in probability. 

The existence of $\eta(t)$ is thus proved. Let $a<c<b$ and $\rho_{[a,c]} = \exp\{\eta(c)\}$, 
$\rho_{[c,b]} = \exp\{\eta(b)-\eta(c)\}$. Clearly $\rho_{[c,d]}$ is measurable with respect to $\mathfrak{F}_{[c,d]}$. 
The proof of the theorem now follows by utilizing the remark following 
Theorem 1. □ 

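Consider the particular case when $\xi_1(t)$ and $\xi_2(t)$ are stochastically 
continuous processes with independent increments defined on $[a, b]$ 
and let $\xi_i(a) = 0$. Denote by $\mu^{(i)}_{[\alpha,\beta]}$ the measure associated with the process 
$\xi_i(t)-\xi_i(\alpha)$ for $t\in[\alpha, \beta]$, where $[\alpha, \beta]\subset[a, b]$. Then the relation $\mu_2\ll\mu_1$ 
implies the relation $\mu^{(2)}_{[\alpha,\beta]}\ll\mu^{(1)}_{[\alpha,\beta]}$. Let $\tilde{\mathfrak{F}}_{[\alpha,\beta]}$ denote the $\sigma$-algebra generated 
by the variables $\xi_1(t)-\xi_1(\alpha)$ for $t\in[\alpha, \beta]$. The variable 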









is $\tilde{\mathfrak{F}}_{[\alpha,\beta]}$-measurable. Let $a = t_0<t_1< \dots <t_n = b$ be an arbitrary 
subdivision of the interval $[a, b]$. Consider the product of measurable 
spaces and the product of measures 






k = 0 



defined on it. 

Since the processes ^i{t) have independent increments, it follows that 
these products of measures are mapped into measures fii under the 1 — 1 
measurable mapping of the product of the spaces + 

into i^ia,bp 5[fl,b]) by means of the formula 

m 

^ (^k) + 1 (0> ^ ^ ^ + 1 5 

k=l 

Therefore 






n-l ^r.(2) 






[tk, tk+l] 



iu-))- 



The factors on the right-hand side are independent. Assume that the measures $\mu_1$ and $\mu_2$ 
are equivalent. In this case the densities are positive and we may take 
the logarithm of the last product. We thus obtain the following 



Theorem 4. In order that 

$$\rho_{[a,t]} = \frac{d\mu^{(2)}_{[a,t]}}{d\mu^{(1)}_{[a,t]}}(\xi_1(\cdot))$$

be densities of equivalent measures - associated with processes with 
independent increments defined on $[a, t]$ - it is necessary and sufficient that the 
composite process $\{\xi_1(t);\ \ln\rho_{[a,t]}\}$ be a process with independent increments, 
$\rho_{[a,t]}$ be $\mathfrak{F}_{[a,t]}$-measurable and that the equality $E\rho_{[a,t]} = 1$ be satisfied. 
Measurable Functions on Hilbert Spaces 



§1. Measurable Linear Functionals and Operators on Hilbert Spaces 

Consider a measurable Hilbert space $(\mathcal{X}, \mathfrak{B})$ on which a measure $\mu$ is 
defined. Every continuous linear functional $l(x)$ defined on $\mathcal{X}$ is clearly 
$\mathfrak{B}$-measurable. It is known that if a sequence of continuous linear 
functionals $l_n(x)$ converges to a certain limit $l(x)$ for all $x$ then this limit will 
also be a continuous linear functional on $\mathcal{X}$. The situation, however, 
becomes different if we require that $l_n(x)$ possess the limit not for all $x$ but 
only on a set $D$ such that $\mu(D) = 1$. It is natural to refer to such limiting 
functions $l(x)$ as $\mu$-measurable functionals. These functions, being limits 
of sequences of measurable functions, will also be $\mathfrak{B}$-measurable. It 
follows from the relations 

$$\lim_{n\to\infty} l_n(\alpha x + \beta y) = \alpha \lim_{n\to\infty} l_n(x) + \beta \lim_{n\to\infty} l_n(y)$$

that the domain of definition $D_l$ of the functional $l(x)$ is a linear manifold 
and that $l(x)$ is a linear (additive and homogeneous) functional. (We 
assume that the functional $l(x)$ is defined wherever the corresponding 
limit exists.) Hereafter we shall consider non-degenerate measures $\mu$ 
such that $\mu(L) = 0$ for any proper subspace $L$ of the space $\mathcal{X}$. Since 
$\mu(D_l) = 1$, the set $D_l$ is dense in $\mathcal{X}$. Thus if $l(x)$ is a measurable functional 
- in the sense stipulated above - then 1) it is defined on a $\mathfrak{B}$-measurable 
linear manifold $D_l$ such that $\mu(D_l) = 1$; 2) $l(x)$ is a $\mathfrak{B}$-measurable function; 
3) $l(x)$ is linear on $D_l$. It turns out that these conditions are sufficient for 
$\mu$-measurability of $l(x)$. This follows from 

Theorem 1. If a function $l(x)$ satisfies conditions 1)-3) then a sequence of 
continuous functionals $l_n(x)$ exists such that 

$$l(x) = \lim_{n\to\infty} l_n(x) \pmod{\mu}.$$

Proof. We shall construct a sequence of continuous functionals $l_n(x)$ 
which converge to $l(x)$ in measure $\mu$. Since a subsequence converging 
$\mu$-almost everywhere can be extracted from this sequence, the 
theorem will be proved. Let $S_c = \{x: |l(x)| <c\}$. Since $\lim_{c\to\infty} \mu(D_l - S_c) = 0$, for 
each $\varepsilon > 0$ one can find $c$ and a compactum $K\subset S_c$ such that $\mu(D_l - K)<\varepsilon$. 
Without loss of generality we may assume that $K$ is convex and 
symmetric since $S_c$ is such. Since $\mu(\{0\}) = 0$ one can find a $\delta>0$ such that 
also 



(here $S_\delta(0)$ is a sphere of radius $\delta$ with the center at point 0). For 
$x\in K-S_\delta(0)$ the inequality 

$$|l(x)| \le \frac{c}{\delta}\,|x|$$

is satisfied. This inequality is also satisfied for all $x\in\mathcal{L}$ where $\mathcal{L}$ is the 
linear hull of the set $K-S_\delta(0)$. Hence $l(x)$ is a bounded linear functional 
on $\mathcal{L}$. In view of the Hahn-Banach theorem there exists a linear extension 
of $l$ onto the whole space $\mathcal{X}$ with the same modulus of continuity. We 
denote this extension by $\tilde{l}(x)$. Let $K_1$ be the convex hull of the set 
$K-S_\delta(0)$. Then 

From it follows the existence of a sequence of bounded linear functionals 
converging in measure $\mu$ to $l(x)$. The theorem is thus proved. □ 

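Corollary. If $l(x)$ is a measurable functional and if $\mathcal{L}_n$ is a sequence of 
finite-dimensional subspaces such that $\mathcal{L}_n\subset D_l$ and $\bigcup_n\mathcal{L}_n$ is dense in $\mathcal{X}$, and 
$P_n$ is the projector on $\mathcal{L}_n$, then $l(P_n x)$ converges in measure $\mu$ to $l(x)$. 

Indeed, let $K_1$ be the compactum constructed in the proof of the 
theorem. If $n$ is chosen in such a manner that $\mathcal{L}_n$ forms a $\delta\varepsilon/c$-net in $K_1$ 
(here $\delta$ and $c$ are as in the proof of the theorem), then 

$$|l(P_n x)-l(x)| \le \sup_{y\in\mathcal{L}} \frac{|l(y)|}{|y|}\,|P_n x-x| \le \frac{c}{\delta}\,|P_n x-x| < \varepsilon$$

for all $x\in K_1$. Therefore 

$$\mu(\{x: |l(P_n x)-l(x)|>\varepsilon\})<\varepsilon. \ \Box$$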

In order to construct the space of all $\mu$-measurable functionals it is 
convenient to use the characteristic functional $\varphi(z)$ of measure $\mu$. Let a 
sequence of continuous functionals $(z_n, x)$ converge in measure $\mu$ to a 
certain measurable functional $l(x)$. Then for each real $t$ 

$$\lim_{n,m\to\infty} \exp\{it(z_n - z_m, x)\} = 1$$

in measure $\mu$. Hence 

$$\lim_{n,m\to\infty} \varphi(t(z_n-z_m)) = \lim_{n,m\to\infty} \int \exp\{it(z_n - z_m, x)\}\,\mu(dx) = 1. \tag{1}$$

Let 

$$k(z) = \int_{-\infty}^{\infty} \frac{1-\varphi(tz)}{1+t^2}\,dt.$$

Then the necessary and sufficient condition for the existence of the limit 
in measure $\mu$ of the sequence $(z_n, x)$ is the condition that 

$$\lim_{n,m\to\infty} k(z_n-z_m) = 0.$$


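The necessity of this condition follows from (1) and the theorem on 
the limiting transition under the sign of the integral. To establish 
sufficiency note that 

$$k(z) = \int\!\!\int_{-\infty}^{\infty} \bigl(1-e^{it(z,x)}\bigr)\frac{dt}{1+t^2}\,\mu(dx) = \pi \int \bigl(1-e^{-|(z,x)|}\bigr)\,\mu(dx).$$

Therefore for any $\varepsilon>0$ 

$$\mu(\{x: |(z,x)|>\varepsilon\}) \le \frac{k(z)}{\pi\,(1-e^{-\varepsilon})},$$

which implies the convergence of $(z_n, x)$ in measure $\mu$. 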

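Since 

$$\begin{aligned}
k(z_1+z_2) &= \pi \int \bigl(1-e^{-|(z_1,x)+(z_2,x)|}\bigr)\,\mu(dx) \\
&\le \pi \int \bigl(1-e^{-|(z_1,x)|-|(z_2,x)|}\bigr)\,\mu(dx) \\
&\le \pi \int \bigl(1-e^{-|(z_1,x)|}\bigr)\,\mu(dx) + \pi \int \bigl(1-e^{-|(z_2,x)|}\bigr)\,\mu(dx) = k(z_1)+k(z_2),
\end{aligned}$$

$\mathcal{X}$ may be regarded as a metric space with the metric 

$$r(x, y) = k(x-y).$$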
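
The closed form used for $k(z)$ rests on the elementary identity $\int_{-\infty}^{\infty}(1-\cos ta)\,\frac{dt}{1+t^2} = \pi(1-e^{-|a|})$. The following sketch is an editorial illustration, not part of the text: it checks that identity numerically with a midpoint rule after the substitution $t=\tan\theta$, and then checks the subadditivity $k(z_1+z_2)\le k(z_1)+k(z_2)$ for a hypothetical discrete measure on the plane (the atoms and weights are arbitrary assumptions).

```python
import math


def k_integral(a, n=200_000):
    """Midpoint rule for the integral of (1 - cos(t*a)) / (1 + t^2) over the real line.

    The substitution t = tan(theta) turns dt / (1 + t^2) into d(theta),
    with theta ranging over (-pi/2, pi/2).
    """
    h = math.pi / n
    s = 0.0
    for j in range(n):
        theta = -math.pi / 2 + (j + 0.5) * h
        s += 1.0 - math.cos(a * math.tan(theta))
    return s * h


def k_closed(a):
    """Closed form pi * (1 - exp(-|a|)) of the same integral."""
    return math.pi * (1.0 - math.exp(-abs(a)))


checks = [(a, k_integral(a), k_closed(a)) for a in (0.5, 1.0, 2.0)]
for a, numeric, exact in checks:
    print(a, numeric, exact)

# Hypothetical discrete measure: atoms x_i in the plane with weights w_i.
atoms = [(1.0, 0.0), (0.5, -2.0), (-1.5, 1.0)]
weights = [0.2, 0.5, 0.3]


def k(z):
    """k(z) = pi * sum_i w_i * (1 - exp(-|(z, x_i)|)) for the discrete measure."""
    return sum(w * k_closed(z[0] * x[0] + z[1] * x[1])
               for w, x in zip(weights, atoms))


z1, z2 = (0.3, 1.0), (-1.0, 0.4)
z_sum = (z1[0] + z2[0], z1[1] + z2[1])
print(k(z_sum), k(z1) + k(z2))
```

The midpoint rule is only a rough tool here, because the integrand oscillates rapidly near the endpoints of the $\theta$-interval; it is adequate for a sanity check to two or three digits.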

Let $\tilde{\mathcal{X}}$ denote the completion of $\mathcal{X}$ in metric $r$. Each element $\tilde{x}$ of $\tilde{\mathcal{X}}$ can 
be associated with a certain $\mu$-measurable functional $l(x)$: $\tilde{x}\leftrightarrow l$, provided 
a sequence $z_n$ exists in $\mathcal{X}$ such that $r(z_n, \tilde{x})\to 0$ and $(z_n, x)\to l(x)$ in measure 
$\mu$. Denote by $\mathcal{L}(\mu)$ the space of all $\mu$-measurable functionals. We shall 
identify those functionals which coincide $\mu$-almost everywhere. Then the 
correspondence between $\tilde{\mathcal{X}}$ and $\mathcal{L}(\mu)$ becomes one-to-one. By 
introducing the distance 

$$r(l_1, l_2) = \pi \int \bigl(1-e^{-|l_1(x)-l_2(x)|}\bigr)\,\mu(dx)$$

in $\mathcal{L}(\mu)$, this correspondence becomes isometric. It is therefore natural to 
identify the spaces $\tilde{\mathcal{X}}$ and $\mathcal{L}(\mu)$ and we shall follow this procedure 
hereafter. 

We note another special feature of the space $\tilde{\mathcal{X}}$ with metric $r$. The 
characteristic functional of measure $\mu$ can be extended by continuity in 
metric $r$ to the whole $\tilde{\mathcal{X}}$. This extension can be written in the form 

$$\varphi(l) = \int e^{il(x)}\,\mu(dx)$$

(here $l$ is a measurable functional regarded as an element of $\tilde{\mathcal{X}} = \mathcal{L}(\mu)$). 
We show that $\tilde{\mathcal{X}}$ is, in a certain sense, the widest possible space to which 
$\varphi(z)$ can be extended by continuity. 

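Let $\mathcal{Y}$ be a linear metric space with metric $\rho$ such that $\rho(x, y) = 
\rho(0, x-y)$ and let $\varphi(z)$ be continuous in this metric $\rho$ on $\mathcal{X}$ and extendable 
by continuity onto $\mathcal{Y}$. Since $\varphi$ is continuous in metric $\rho$, for any $\varepsilon > 0$ a $\delta > 0$ 
can be found such that $\mathrm{Re}(1-\varphi(z))<\varepsilon$ provided $\rho(0, z)<\delta$. Then 
utilizing the inequality 

$$|\varphi(z_1)-\varphi(z_2)|^2 \le \int \bigl|e^{i(z_1,x)}-e^{i(z_2,x)}\bigr|^2\,\mu(dx) = \int 2\bigl(1-\cos(z_1-z_2, x)\bigr)\,\mu(dx) = 2\,\mathrm{Re}\bigl(1-\varphi(z_1-z_2)\bigr),$$

we find that for $\mathrm{Re}(1-\varphi(z)) < \varepsilon$ 

$$\mathrm{Re}\bigl(1-\varphi(nz)\bigr) \le \sum_{k=1}^{n} |\varphi((k-1)z)-\varphi(kz)| \le n\sqrt{2\varepsilon}.$$

Therefore for $\rho(0, z)<\delta$ 

$$k(z) = \int \mathrm{Re}\bigl(1-\varphi(tz)\bigr)\frac{dt}{1+t^2} = \int_{|t|\le n} \mathrm{Re}\bigl(1-\varphi(tz)\bigr)\frac{dt}{1+t^2} + \int_{|t|>n} \mathrm{Re}\bigl(1-\varphi(tz)\bigr)\frac{dt}{1+t^2} \le n\pi\sqrt{2\varepsilon} + \frac{4}{n}.$$

Hence if $\rho(z_n, z_m)\to 0$ then $k(z_n - z_m)\to 0$, so that $\mathcal{Y}$ can be isometrically 
imbedded into a certain subset of $\tilde{\mathcal{X}}$. 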

The space $\tilde{\mathcal{X}}$ is significantly wider than $\mathcal{X}$ since it contains, for 
example, the spaces obtained by completion of $\mathcal{X}$ in the scalar 
product $(x, y)_{-} = (Bx, y)$, where $B$ is a kernel operator such that $\varphi(x)$ is 
continuous in the scalar product $(Bz, z)$. 

In addition to the space of all measurable functionals $\mathcal{L}(\mu)$ one may 
consider the space $\mathcal{L}_2(\mu)$ of all square integrable linear functionals. 

However, this space may consist of only the null element. If the measure 
$\mu$ possesses a finite correlation operator $C$, then $\mathcal{L}_2(\mu)$ will contain 
the completion of $\mathcal{X}$ in the scalar product $(Cx, y)$, but may not 
necessarily coincide with this completion. Moreover, it may occur that 
$\int (x, z)^2\,\mu(dx) = +\infty$ for all $z\ne 0$ while $\mathcal{L}_2(\mu)$ may contain elements 
different from the zero element. 

As an example, consider the measure $\mu$ which is the distribution of a 
random element $\xi$ of the form 

$$\xi = \sum_{k=1}^{\infty} \eta_k a_k e_k,$$

where $\{e_k\}$ is an orthonormal basis, $\eta_k$ a sequence of identically distributed 
independent random variables with stable distributions: 

First let $\gamma = 0$ and $\alpha > 1$. Then $(\xi, z)$ possesses a stable distribution with 
the same exponent $\alpha$. Therefore for any functional $l(x)$ in $\mathcal{L}(\mu)$ 

$$\int e^{itl(x)}\,\mu(dx) = e^{-c|t|^{\alpha}}.$$

Hence, 

$$\int l^2(x)\,\mu(dx)< \infty \quad\text{only if } l(x) = 0.$$

If $\alpha> 1$ and $\gamma\ne 0$ then by choosing a sequence $z_n = \frac{1}{n}\sum_{k=1}^{n}\frac{1}{a_k}e_k$, we have 
$(\xi, z_n) = \frac{1}{n}\sum_{k=1}^{n}\eta_k$. Therefore with probability 1, i.e. $\mu$-almost everywhere, 
the limit 

$$\lim_{n\to\infty} (x, z_n) = E\eta_k = \gamma$$

exists. Clearly $l(x) = \lim_{n\to\infty} (x, z_n)$ belongs to $\mathcal{L}_2(\mu)$. 
On the other hand $(z, x)$ will have a stable distribution with exponent $\alpha$ 
and hence 

$$\int (z, x)^2\,\mu(dx) = \infty \quad\text{for } z\ne 0.$$

These examples show that in the general case, it is (somewhat) unnatural 
to consider the space $\mathcal{L}_2(\mu)$. 

Measurable linear operators. As in the case of measurable functionals, 
measurable linear operators are defined naturally as limits - in measure 
$\mu$ - of a sequence of continuous operators. Since there may be a strong 
or weak convergence of the sequence $A_n x$, we may define the 
measurability as either strong or weak. Thus an operator $A$ is called strongly 
(weakly) measurable with respect to measure $\mu$ if a sequence of 
continuous linear operators $A_n$ exists such that $A_n x$ converges strongly (weakly) 
to $Ax$ (mod $\mu$). Obviously a strongly measurable operator is also a weakly 
measurable one. Let $A$ be weakly measurable. Denote by $D_A$ the set of 
$x$ such that the weak limit of the sequence $A_n x$ exists. Then denoting by 
$N$ a certain countable set dense in $\mathcal{X}$ we will have 

$$D_A = \Bigl\{x : \sup_n |A_n x| < \infty;\ \lim_{n\to\infty} (z, A_n x) \text{ exists for } z\in N\Bigr\}.$$

From this relation it follows that $D_A$ is measurable. It is also clear that 
$D_A$ is a linear manifold. The weak limit 
$Ax = \lim_{n\to\infty} A_n x$ exists for all $x\in D_A$ and moreover $A(\alpha x + \beta y) = \alpha Ax + \beta Ay$ 
for all real $\alpha$ and $\beta$ and $x, y\in D_A$. Finally we find that $\mu(D_A) = 1$. We show 
that the above conditions are sufficient even for a strong measurability 
of operator $A$. This will also show that the notions of weak and strong 
measurability are equivalent. 

Theorem 2. Let a measurable function $Ax$ with values in $\mathcal{X}$ satisfying the 
relation $A(\alpha x + \beta y) = \alpha Ax + \beta Ay$ for all $x, y\in D_A$ and real $\alpha$, $\beta$ be defined 
on a certain measurable linear manifold $D_A$ such that $\mu(D_A) = 1$. Then a 
sequence of continuous linear operators $A_n$ exists such that 

$$Ax = \lim_{n\to\infty} A_n x \pmod{\mu}.$$

Proof. Note that $|Ax|$ is a measurable function. Therefore 

$$\lim_{c\to\infty} \mu(\{x: |Ax|>c\}) = 0,$$

and hence for each $\varepsilon > 0$ one can find $c$ and a compactum $K$ such that $|Ax| < c$ 
for $x\in K$ and $\mu(D_A - K)<\varepsilon$. This compactum may be considered a convex 
and centrally symmetric set. 

We choose $\delta$ as in the proof of Theorem 1. Let $K_1$ and $\mathcal{L}$ be as in 
Theorem 1. Then 

$$|Ax| \le \frac{c}{\delta}\,|x|.$$

Let $N$ be a finite-dimensional subspace of $\mathcal{L}$ such that $N\cap K_1$ forms 
a $\frac{\delta}{c}\varepsilon$-net in $K_1$. Construct the operator $A_N$ in the following manner: for 
$x\in N$, $A_N x = Ax$; if $y$ is orthogonal to $N$, then $A_N y = 0$. This extension of 
$A$ from $N$ onto the whole $\mathcal{X}$ does not increase the modulus of continuity 
of A. Therefore — — x| <s for xeK^. Hence, 

ye^ \y\ 

$$\mu(\{x: |A_N x - Ax|> \varepsilon\})<\varepsilon.$$

Choosing a sequence $\varepsilon_n\to 0$ we construct a sequence of bounded linear 
operators which converges to $A$ in measure, and a subsequence can then 
be extracted from such a sequence which converges almost everywhere. 
The theorem is thus proved. □ 

Henceforth we shall use the term “a measurable linear operator” 
without specification as to strong (or weak). 

We now consider the notion of an absolutely measurable linear 
operator. A measurable linear operator $A$ is called absolutely measurable if 
for each measurable linear functional $l(x)$ the expression $l(Ax)$ is also a 
measurable linear functional. The last assertion may be interpreted in 
two ways. First, since a sequence $z_n$ exists such that $k(l - z_n)\to 0$, one can 
understand $l(Ax)$ as the limit in measure of the sequence of 
measurable functionals $l_n(x) = (z_n, Ax)$. Secondly, one may interpret $l(Ax)$ as the 
standard superposition of two measurable functions. This superposition 
is also measurable and the condition of additivity and homogeneity is 
fulfilled in the domain of definition of this function. The set 
$\{x: Ax\in D_l\}$, where $D_l$ is the domain of definition of $l(x)$, is the 
domain of definition of this function. If $\Delta_A$ denotes the range of 
values of the operator $A$, then in order that $l(Ax)$ be a measurable 
functional it is necessary and sufficient that the equality 

$$\mu\bigl(A^{-1}(\Delta_A\cap D_l)\bigr) = 1$$

be satisfied. Moreover, since any measurable linear manifold $L$ such that 
$\mu(L) = 1$ may serve as the set $D_l$, the condition $\mu(A^{-1}(\Delta_A\cap L)) = 1$ must 
be satisfied provided $\mu(L) = 1$. Utilizing Theorems 1 and 2 one can verify 
the equivalence of both interpretations of the measurability of $l(Ax)$. 

We now describe the structure of an absolutely measurable operator. 
Note that for any absolutely measurable operator $A$ the convergence in 
measure $\mu$ of a sequence of measurable linear functionals $l_n(x)$ to $l(x)$ 
implies the convergence in measure $\mu$ of the sequence $l_n(Ax)$ to $l(Ax)$. To 
show this it is sufficient to consider the convergence almost everywhere 
in place of the convergence in measure. In this case, one can find a linear 
manifold $L$ with $\mu(L) = 1$ such that $l_n(x)\to l(x)$ for $x\in L$. Hence 
$l_n(Ax)\to l(Ax)$ for all $x$ such that $Ax\in L$ and this set is of measure 1 in view of 
the absolute measurability of the operator $A$. Thus an operator $A^*$ can 
be associated with $A$ which maps $\tilde{\mathcal{X}}$ into $\tilde{\mathcal{X}}$ acting according to the 
formula $[A^*l](x) = l(Ax)$. It follows from the above that this operator 
is continuous in metric $r$ since the convergence of functionals in this 
metric is equivalent to their convergence in measure $\mu$. We show that 
the converse is also true: if $A$ is a measurable linear operator such that 
the operator $A^*$ defined by the relation $A^*z(x) = (z, Ax)$ for all $z\in\mathcal{X}$ is 
extended by continuity in metric $r$ over the whole $\tilde{\mathcal{X}}$ then $A$ is an absolutely 
measurable operator. If $A^*$ is extendable by continuity on $\tilde{\mathcal{X}}$ then $\varphi(A^*z)$ 
will be an $r$-continuous positive definite functional. Therefore the series 

$$Ax = \sum_{k=1}^{\infty} (A^*e_k, x)\,e_k$$

is convergent $\mu$-almost everywhere in any orthogonal basis $\{e_k\}$ and 
moreover $\varphi(A^*z)$ will be the characteristic functional of the variable $Ax$ 
defined in this manner. Let $l$ be a measurable functional, $\{f_k\}$ an 
orthonormal basis in $D_l$. Then 

$$Ax = \sum_{k=1}^{\infty} (Ax, f_k)\,f_k = \sum_{k=1}^{\infty} (A^*f_k, x)\,f_k.$$

Let $P_n$ be the projector on the subspace spanned by $f_1, \dots, f_n$. We show 
that $l(P_n Ax)$ is convergent in measure to a certain limit. Indeed, 

$$l(P_n Ax) = \sum_{j=1}^{n} (A^*f_j, x)\,l(f_j) = \Bigl[A^* \sum_{j=1}^{n} l(f_j) f_j\Bigr](x)$$

and since $\sum_{j=1}^{n} l(f_j)(f_j, x) = l(P_n x)$ converges in measure $\mu$ to $l(x)$ in view 
of the corollary of Theorem 1, we have 

$$\Bigl[A^* \sum_{j=1}^{n} l(f_j) f_j\Bigr](x) \to (A^*l)(x)$$

in measure $\mu$ in view of the continuity of $A^*$ in $\tilde{\mathcal{X}}$. Hence 

$$l(Ax) = [A^*l](x)$$

is a measurable linear functional for any measurable linear functional $l$. The absolute 
measurability of $A$ is thus proved. □ 
We now consider measurable linear mappings of one Hilbert space 
$\mathcal{X}$ into another Hilbert space $\mathcal{Y}$. 

We shall discuss only strongly measurable mappings. It turns out 
that the study of such mappings easily reduces to the study of measurable 
mappings of $\mathcal{X}$ into $\mathcal{X}$, i.e. of measurable linear operators. Indeed, let $R$ 
be an isometric one-to-one mapping of $\mathcal{Y}$ onto $\mathcal{X}$ (we assume that both 
spaces are separable). Let $V$ be a measurable mapping of $\mathcal{X}$ into $\mathcal{Y}$; then a 
sequence $V_n$ of continuous linear mappings can be found such that 
$V_n x\to Vx$ (in $\mathcal{Y}$) in measure $\mu$. But $RV_n$ is a sequence of continuous linear 
mappings of $\mathcal{X}$ into $\mathcal{X}$ convergent in measure to $RV$. Hence $RV$ is a 
measurable linear operator. Conversely, if $U$ is a measurable linear 
operator mapping $\mathcal{X}$ into $\mathcal{X}$, then $R^{-1}U$ is a measurable linear mapping 
of $\mathcal{X}$ into $\mathcal{Y}$. We thus have completely described all the measurable 
linear mappings of $\mathcal{X}$ into $\mathcal{Y}$. 

Denote by $\nu$ the measure defined on $\mathcal{Y}$ by the relation $\nu(E) = \mu(V^{-1}(E))$, 
where $V$ is a measurable linear mapping of $\mathcal{X}$ into $\mathcal{Y}$. We find the 
characteristic functional of measure $\nu$. For this purpose the notion of a conjugate 
mapping of $V$ will be required. Let $D$ be the domain of definition of the 
mapping $V$ with $\mu(D) = 1$. The expression $(Vx, y)$ is defined for all $y\in\mathcal{Y}$ 
and $x\in D$ and is a measurable functional on $D$. Therefore an element 
$l_y\in\tilde{\mathcal{X}}$ exists such that $(Vx, y) = l_y(x)$. Set $l_y = V^*y$; $V^*y$ defines a 
homogeneous and additive mapping of $\mathcal{Y}$ into $\tilde{\mathcal{X}}$ which is continuous in the 
following sense: $r(V^*y_1, V^*y_2)\to 0$ as $|y_1 - y_2|\to 0$. This mapping $V^*$ is 
called the mapping conjugate to $V$. The case when $V^*$ can be regarded as a 
measurable mapping (with respect to measure $\nu$) of $\mathcal{Y}$ to $\mathcal{X}$ is of special 
interest. Let $\{e_k\}$ be an orthonormal basis in $\mathcal{X}$. In order for $V^*y$ to 
belong to $\mathcal{X}$ it is necessary and sufficient that 

$$V^*y = \sum_{k=1}^{\infty} (V^*y, e_k)\,e_k = \sum_{k=1}^{\infty} (y, Ve_k)\,e_k$$

and that the series $\sum_k (y, Ve_k)^2$ be convergent $\nu$-almost everywhere. The 
last assertion is equivalent to the $\mu$-almost everywhere convergence of the 
series $\sum_k (Vx, Ve_k)^2$. Finally we find $\varphi_\nu(y)$, the characteristic functional of 
measure $\nu$. Denote by $\varphi_\mu(l)$ the extension by continuity of the 
characteristic functional $\varphi_\mu(z)$ on $\tilde{\mathcal{X}}$. Then 

$$\varphi_\nu(y) = \int e^{i(Vx, y)}\,\mu(dx) = \int e^{i[V^*y](x)}\,\mu(dx) = \varphi_\mu(V^*y).$$
§2. Measurable Polynomial Functions. Orthogonal Polynomials 

Despite the fact that, in the previous section, we have established - even 
for the case of linear measurable functions - the basic difference between 
the spaces of measurable functions and the spaces of square integrable 
functions, we shall, nevertheless, when studying polynomial functions of 
higher degrees, confine ourselves to the case of square integrable 
functions which arise as mean square limits of continuous polynomial 
functions. The reasons for this restriction are, on the one hand, the 
complexity of the structure of measurable quadratic functions (let alone 
polynomial functions of higher degree), and on the other hand, the ease 
with which square integrable functions can be used in various 
analytic applications, in particular for the construction of orthogonal 
expansions. However, to assure the existence of nontrivial square 
integrable continuous polynomials, certain restrictions should be imposed 
on the measure $\mu$. 

Denote by $M_n$ the class of measures $\mu$ such that 
$\int |x|^n\,\mu(dx)< \infty$ and let $M_\infty = \bigcap_n M_n$. 

Gaussian measures are examples of measures in $M_\infty$. In the case of 
measures $\mu$ belonging to $M_{2n}$ every polynomial of degree $n$ will belong to 
$\mathcal{L}_2[\mu]$, which is the space of measurable functions, square integrable 
with respect to $\mu$. Moreover, if $\mu\in M_\infty$, then $\mathcal{L}_2[\mu]$ contains all the 
continuous polynomials. We recall the definition of a polynomial 
function (or simply a polynomial). A function $\Phi(x)$ which can be 
represented in the form 

$$\Phi(x) = H(x, \dots, x),$$

where $H(x_1, \dots, x_n)$ is an $n$-linear form on $\mathcal{X}$, is called a homogeneous 
polynomial of degree $n$, and a function of the form 

$$T_n(x) = \sum_{k=0}^{n} \Phi_k(x),$$

where $\Phi_k$ is a homogeneous polynomial of degree $k$, is called a polynomial 
of degree $n$. For each homogeneous polynomial $\Phi$ of degree $k$ there 
exists a $k$-linear continuous symmetric function $H(x_1, \dots, x_k)$ generating 
the polynomial (or associated with the polynomial). Such a function is 
uniquely determined. Let $\{e_k\}$ be an orthonormal basis in $\mathcal{X}$. The numbers 

$$H_{i_1 \dots i_k} = H(e_{i_1}, \dots, e_{i_k})$$

are called the coefficients of the function $H$ and the form $\Phi$ in this basis. 
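
The uniqueness of the symmetric generating function can be made concrete through the standard polarization identity $H(x_1,\dots,x_n) = \frac{1}{2^n n!}\sum_{\varepsilon_i=\pm 1}\varepsilon_1\cdots\varepsilon_n\,\Phi(\varepsilon_1x_1+\dots+\varepsilon_nx_n)$. This identity is supplied here as an illustration and is not stated in the text. The sketch below is finite-dimensional; the function name `polarize` and the sample cubic are assumptions made for the example.

```python
from itertools import product
from math import factorial


def polarize(phi, vectors):
    """Symmetric n-linear form of a homogeneous polynomial phi of degree n.

    Uses H(x_1..x_n) = (2^n n!)^-1 * sum over signs e of
    e_1*...*e_n * phi(e_1 x_1 + ... + e_n x_n), with n = len(vectors).
    """
    n = len(vectors)
    dim = len(vectors[0])
    total = 0.0
    for signs in product((1, -1), repeat=n):
        point = [sum(s * v[i] for s, v in zip(signs, vectors)) for i in range(dim)]
        sign = 1
        for s in signs:
            sign *= s
        total += sign * phi(point)
    return total / (2 ** n * factorial(n))


def cubic(x):
    """Sample homogeneous polynomial of degree 3 on the plane (an assumption)."""
    return x[0] ** 2 * x[1]


e0, e1 = [1.0, 0.0], [0.0, 1.0]

# Coefficient H_{001} = H(e0, e0, e1) of the symmetric trilinear form.
h_001 = polarize(cubic, [e0, e0, e1])

# On the diagonal the form reproduces the polynomial: H(x, x, x) = cubic(x).
x = [0.3, -1.2]
diag = polarize(cubic, [x, x, x])
print(h_001, diag, cubic(x))
```

For this cubic the only nonzero coefficients are those with two indices equal to 0 and one equal to 1; they share the value of `h_001` by symmetry.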
The form $\Phi$ is expressed in terms of its coefficients in the following manner: 

$$\Phi(x) = \sum_{i_1, \dots, i_k} H_{i_1 \dots i_k}\,(x, e_{i_1})\cdots(x, e_{i_k}).$$

Consider the expression for the 
integral $\int T_n(x)\,T_m(x)\,\mu(dx)$, where $T_n$ and $T_m$ are polynomials, in terms 
of the characteristics of measure $\mu$. Since $T_n(x)\,T_m(x)$ is a polynomial it is 
sufficient to be able to determine integrals of a single polynomial and to 
do this, it is sufficient to determine integrals of homogeneous forms. Let 
$\mu\in M_n$ and $\Phi(x)$ be a homogeneous form of degree $n$. Denote by $H$ the 
corresponding $n$-linear continuous function. If $P$ is the projector on a 
certain space, then 

$$\begin{aligned}
|\Phi(x)-\Phi(Px)| &= |H(x, \dots, x)-H(Px, \dots, Px)| \\
&\le \sum_{k=1}^{n} \bigl|H(\underbrace{x, \dots, x}_{k}, Px, \dots, Px) - H(\underbrace{x, \dots, x}_{k-1}, Px, \dots, Px)\bigr| \\
&= \sum_{k=1}^{n} \bigl|H(\underbrace{x, \dots, x}_{k-1}, x-Px, \underbrace{Px, \dots, Px}_{n-k})\bigr| \le nC\,|x-Px|\,|x|^{n-1},
\end{aligned}$$

where $C = \sup[H(x_1, \dots, x_n);\ |x_i|\le 1]$. Since $\Phi(x)-\Phi(Px)\to 0$ as $P\to I$ and 
is bounded by the quantity $2nC|x|^n$, integrable with respect to measure 
$\mu$, it follows that 

$$\int \Phi(x)\,\mu(dx) = \lim_{P\to I} \int \Phi(Px)\,\mu(dx).$$



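We choose an arbitrary orthonormal basis $\{e_k\}$ and denote by $P_m$ the 
projector on the subspace spanned by $e_1, \dots, e_m$. Then 

$$\int \Phi(x)\,\mu(dx) = \lim_{m\to\infty} \int \Phi(P_m x)\,\mu(dx).$$

If $H_{i_1 \dots i_n}$ are the coefficients of the form in the basis $\{e_k\}$ then 

$$\Phi(P_m x) = \sum_{\substack{i_k\le m \\ k=1,\dots,n}} H_{i_1 \dots i_n}\,(x, e_{i_1})\cdots(x, e_{i_n})$$

and 

$$\int \Phi(P_m x)\,\mu(dx) = \sum_{i_k\le m} H_{i_1 \dots i_n} \int (x, e_{i_1})\cdots(x, e_{i_n})\,\mu(dx).$$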
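Let 

$$S_n^{\mu}(z_1, \dots, z_n) = \int (x, z_1)\cdots(x, z_n)\,\mu(dx)$$

be the $n$-moment form for measure $\mu$. Then 

$$S^{\mu}_{i_1 \dots i_n} = \int (x, e_{i_1})\cdots(x, e_{i_n})\,\mu(dx)$$

are the coefficients of this form in the basis $\{e_k\}$. Consequently, 

$$\int \Phi(x)\,\mu(dx) = \lim_{m\to\infty} \sum_{\substack{i_k\le m \\ k=1,\dots,n}} H_{i_1 \dots i_n}\,S^{\mu}_{i_1 \dots i_n}.$$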
To prove that the expression on the right of the equation is the sum of 
a convergent series, note that the relation 

$$\int \Phi(x)\,\mu(dx) = \lim_{m_1\to\infty, \dots, m_n\to\infty} \int H(P_{m_1}x, \dots, P_{m_n}x)\,\mu(dx) = \lim_{m_1\to\infty, \dots, m_n\to\infty} \sum_{i_1=1}^{m_1} \cdots \sum_{i_n=1}^{m_n} H_{i_1 \dots i_n}\,S^{\mu}_{i_1 \dots i_n}$$

is satisfied. This implies the convergence of the series 

$$\sum_{i_1, \dots, i_n=1}^{\infty} H_{i_1 \dots i_n}\,S^{\mu}_{i_1 \dots i_n}. \tag{1}$$

If for two $n$-linear symmetric continuous functions $H$ and $S$ the series 
(1) is convergent in any orthonormal basis, then the sum of this series 
(which is independent of the choice of the basis) will be denoted by 
$\mathrm{Sp}\,H * S$ and will be called the trace of the product of these forms. Thus 
the formula 

$$\int \Phi(x)\,\mu(dx) = \mathrm{Sp}\,H * S_n^{\mu} \tag{2}$$

is verified, where $H$ is the $n$-linear form corresponding to the 
homogeneous form $\Phi$ and $S_n^{\mu}$ is the $n$-moment form of measure $\mu$. 
Construction of an orthogonal system of polynomial functions. 
Henceforth we shall assume that $\mu\in M_\infty$. The mean square limit of continuous 
polynomials of degree $n$ will be referred to as a measurable polynomial 
of degree at most $n$. For construction of all the measurable polynomials 
it is convenient to utilize an orthogonal system of polynomials. Let 
$\tilde{\mathcal{P}}_n$ be the set of all measurable polynomials of degree at most $n$; $\tilde{\mathcal{P}}_n$ is a 
subspace in the Hilbert space $\mathcal{L}_2[\mu]$. Clearly $\tilde{\mathcal{P}}_0\subset\tilde{\mathcal{P}}_1\subset\dots$ The 
subspace of $\tilde{\mathcal{P}}_n$ which is the orthogonal complement to $\tilde{\mathcal{P}}_{n-1}$ is denoted 
by $\mathcal{P}_n$. The subspaces $\mathcal{P}_n$ are mutually orthogonal and are 
called an orthogonal system of polynomials. Every measurable 
polynomial can be uniquely represented in the form $\sum_{k=0}^{n} P_k(x)$, where $P_k\in\mathcal{P}_k$. 
To construct all the measurable polynomials it is sufficient to construct 
the whole subspace $\mathcal{P}_n$. Such a construction is normally carried out by 
induction. 

Let $T(x)$ be a homogeneous form of degree $n$ generated by a 
symmetric $n$-linear function $H$. Then 

$$T(x) = P_n(H, x) - \sum_{k=0}^{n-1} Q_k(H, x), \tag{3}$$

where $P_n(H, x)\in\mathcal{P}_n$ and $Q_k(H, x)\in\mathcal{P}_k$. Clearly $P_n(H, x)$ and $Q_k(H, x)$ are 
linearly dependent on $H$. Denote by $\mathcal{H}^n$ the space of all $n$-linear 
continuous functions. Introduce in $\mathcal{H}^n$ the scalar product 

$$\langle H, H'\rangle_n = \int P_n(H, x)\,P_n(H', x)\,\mu(dx) \tag{4}$$

and complete the space with respect to this scalar product. The Hilbert 
space obtained will be denoted by $\Phi^n$ and its elements will be called 
generalized forms. Note that the correspondence $H\leftrightarrow P_n(H, x)$ is 
isometric and therefore can be extended over the whole $\Phi^n$. The function 
in $\mathcal{P}_n$ which corresponds to $H\in\Phi^n$ will also be denoted by $P_n(H, x)$. 
Functions $Q_k(H, x)$ in formula (3) can be represented in the form 
$Q_k(H, x) = P_k(H_k, x)$ where $H_k\in\Phi^k$. The linear operator which maps 
$H\in\mathcal{H}^n$ into $H_k$ will be denoted by $V_{nk}$. We thus obtain from (3) 

$$P_n(H, x) = T(x) + \sum_{k=0}^{n-1} P_k(V_{nk}H, x). \tag{5}$$

The last formula shows that in order to determine $P_n(H, x)$ for $H\in\mathcal{H}^n$ 
it is sufficient to know the operators $V_{nk}$, while to extend $P_n(H, x)$ onto $\Phi^n$ 
it is necessary to know the scalar product $\langle\cdot, \cdot\rangle_n$. If these two 
characteristics are known one can reduce the construction of $P_n(H, x)$ to the 
construction of $P_k(H, x)$ for $k<n$. 

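Denote for arbitrary $H_n\in\mathcal{H}^n$ and $H_k\in\mathcal{H}^k$ the $(n + k)$-linear form 
$H_n(x_1, \dots, x_n)\,H_k(x_{n+1}, \dots, x_{n+k})$ (this form is asymmetric) by $H_n\times H_k$. 
The following recursion relation follows from formulas (2), (4) and (5): 

$$\langle H, H'\rangle_n = \mathrm{Sp}(H\times H' * S_{2n}^{\mu}) - \sum_{k=0}^{n-1} \langle V_{nk}H, V_{nk}H'\rangle_k. \tag{6}$$

To determine $V_{nk}$ we introduce bilinear forms $A_{nk}(H_n, H_k)$ defined for 
$H_n\in\mathcal{H}^n$ and $H_k\in\mathcal{H}^k$: 

$$A_{nk}(H_n, H_k) = \int T(x)\,P_k(H_k, x)\,\mu(dx), \tag{7}$$

where $T(x)$ is a homogeneous form of degree $n$ and $H_n\in\mathcal{H}^n$ is the 
corresponding $n$-linear function. To determine $A_{nk}$ we have the recursion 
relation 

$$A_{nk}(H_n, H_k) = \mathrm{Sp}(H_n\times H_k * S_{n+k}^{\mu}) + \sum_{j=0}^{k-1} A_{nj}(H_n, V_{kj}H_k), \tag{8}$$

which follows from (5) and (7). Finally, to determine $V_{nk}$ we have the 
following equality, which is valid for all $H_n\in\mathcal{H}^n$ and $H_k\in\mathcal{H}^k$: 

$$\langle V_{nk}H_n, H_k\rangle_k = -A_{nk}(H_n, H_k). \tag{9}$$

(To obtain this relation, multiply (5) by $P_k(H_k, x)$ and then integrate.) If 
$\{H_k^{(m)},\ m = 1, 2, \dots\}$ is an orthonormal basis in $\Phi^k$ then 

$$V_{nk}H = -\sum_{m=1}^{\infty} A_{nk}(H, H_k^{(m)})\,H_k^{(m)}. \tag{10}$$

Relations (6), (8) and (9) enable us to determine successively $\langle\cdot, \cdot\rangle_0$, $A_{10}$, 
$V_{10}$, $\langle\cdot, \cdot\rangle_1$, $A_{20}$, $V_{20}$, $A_{21}$, $V_{21}$, ... and so on. 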

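As an example consider the construction of the subspaces $\mathcal{P}_n$ for the
Gaussian measure $\mu$ with mean 0 and the correlation operator $B$. All the
formulas are significantly simplified if the traces are calculated in the
scalar product

$$(x, y)_+ = (Bx, y).$$

The moment forms of the measure $\mu$ are especially simple in this scalar
product: $S_n(z_1, \ldots, z_n) = 0$ for $n$ odd and

$$S_n(z_1, \ldots, z_n) = \sum \prod_{k=1}^{n/2} (z_{i_k}, z_{j_k})_+$$

for $n$ even. Here the summation is carried over all possible partitions of
the numbers $1, 2, \ldots, n$ into $n/2$ pairs $(i_k, j_k)$. The last formula
follows from the equality

$$S_n(z_1, \ldots, z_n) = i^{-n} \frac{\partial^n}{\partial\alpha_1 \cdots \partial\alpha_n} \int \exp\Bigl\{ i \Bigl( x, \sum_{k=1}^{n} \alpha_k z_k \Bigr) \Bigr\}\, \mu(dx) \Big|_{\alpha_1 = \cdots = \alpha_n = 0}$$
$$= i^{-n} \frac{\partial^n}{\partial\alpha_1 \cdots \partial\alpha_n} \exp\Bigl\{ -\frac{1}{2} \Bigl( \sum_{k=1}^{n} \alpha_k z_k, \sum_{k=1}^{n} \alpha_k z_k \Bigr)_+ \Bigr\} \Big|_{\alpha_1 = \cdots = \alpha_n = 0}.$$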
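The pairing formula for the Gaussian moment forms can be checked by brute force in finite dimensions. The following is only an illustrative sketch: the function names `pairings` and `gaussian_moment` are hypothetical, and the matrix `cov` is assumed to hold the values $(z_i, z_j)_+$.

```python
from math import prod


def pairings(indices):
    """Yield every partition of the list `indices` into unordered pairs."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for pos, partner in enumerate(rest):
        remaining = rest[:pos] + rest[pos + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def gaussian_moment(cov):
    """Moment form S_n: sum over all pairings of products of cov[i][j].

    cov[i][j] plays the role of (z_i, z_j)_+ (an assumption of this sketch).
    """
    n = len(cov)
    if n % 2:
        return 0  # odd moments of a centered Gaussian measure vanish
    return sum(prod(cov[i][j] for i, j in p) for p in pairings(list(range(n))))
```

For $n = 4$ and all $(z_i, z_j)_+ = 1$ the sum has three terms, which is the fourth moment of a standard normal variable.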
The traces in this scalar product $(\cdot, \cdot)_+$ will be denoted by
$\mathrm{Sp}_+$. We introduce the mapping $\mathrm{Sp}^1_+$ of $\Phi^n$ into $\Phi^{n-2}$
defined by the formula

$$\mathrm{Sp}^1_+ H_n(z_1, \ldots, z_{n-2}) = \sum_k H_n(e_k, e_k, z_1, \ldots, z_{n-2}),$$

where $\{e_k\}$ is a certain orthonormal basis (with respect to the scalar
product $(\cdot, \cdot)_+$). We shall construct successively the operators
$V_{nk}$ and the scalar products $\langle \cdot, \cdot \rangle_n$. First we compute

$$\mathrm{Sp}_+ (H_n \times H_k * S_{n+k}).$$

The quantity is zero for $n + k$ odd. Let $n + k = 2m$. Note that for the
evaluation of $\mathrm{Sp}_+(H_{n+k} * S_{n+k})$, where $H_{n+k}$ is an $(n+k)$-linear
form, one should subdivide the arguments of $H_{n+k}$ into all possible pairs,
then convolve in each pair (i.e. substitute in place of this pair of
arguments identical vectors from the orthonormal basis and sum over the
basis) and finally add the results. Let the arguments of $H_n \times H_k$ be
subdivided into pairs in such a manner that there are $s$ pairs containing
one argument of $H_n$ and one of $H_k$. Convolving each form separately over
the remaining pairs we obtain

$$\mathrm{Sp}_+ \bigl( (\mathrm{Sp}^1_+)^{(n-s)/2} H_n * (\mathrm{Sp}^1_+)^{(k-s)/2} H_k \bigr).$$

The number of such partitions is

$$C_n^s\, k(k-1) \cdots (k-s+1)\, (n-s-1)!!\, (k-s-1)!! = \frac{n!\, k!}{s!\, (n-s)!!\, (k-s)!!}.$$

Hence putting $\frac{n-s}{2} = j$, $\frac{k-s}{2} = r$ we obtain

$$\mathrm{Sp}_+ (H_n \times H_k * S_{n+k}) = \sum_{s} \frac{n!\, k!}{s!\, (n-s)!!\, (k-s)!!}\, \mathrm{Sp}_+ \bigl( (\mathrm{Sp}^1_+)^{j} H_n * (\mathrm{Sp}^1_+)^{r} H_k \bigr),$$

where the sum is taken over those $s \le \min(n, k)$ for which $n - s$ and
$k - s$ are even.
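The count of partitions with exactly $s$ cross pairs can be verified numerically for small $n$ and $k$. This is a sketch under stated assumptions: `cross_pair_count` and `closed_form` are hypothetical names, and the conventions $0!! = (-1)!! = 1$ are assumed.

```python
from math import comb, factorial


def double_factorial(m):
    """m!! with the conventions 0!! = (-1)!! = 1."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def cross_pair_count(n, k, s):
    """Pairings of n + k arguments with exactly s pairs joining the two groups."""
    if s > min(n, k) or (n - s) % 2 or (k - s) % 2:
        return 0
    falling = factorial(k) // factorial(k - s)  # k (k-1) ... (k-s+1)
    return (comb(n, s) * falling
            * double_factorial(n - s - 1) * double_factorial(k - s - 1))


def closed_form(n, k, s):
    """n! k! / (s! (n-s)!! (k-s)!!), valid when n - s and k - s are even."""
    return factorial(n) * factorial(k) // (
        factorial(s) * double_factorial(n - s) * double_factorial(k - s))
```

Summing `cross_pair_count(n, k, s)` over $s$ recovers $(n+k-1)!!$, the total number of pairings of $n + k$ arguments.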
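Utilizing this formula we determine

$$V_{2n,0} H_{2n} = -(2n-1)!!\, (\mathrm{Sp}^1_+)^n H_{2n},$$
$$V_{2n+1,1} H_{2n+1} = -(2n+1)!!\, (\mathrm{Sp}^1_+)^n H_{2n+1}.$$

Next, by (8),

$$A_{2n,2}(H_{2n}, H_2) = \frac{(2n)!}{(2n-2)!!}\, \mathrm{Sp}_+ \bigl( (\mathrm{Sp}^1_+)^{n-1} H_{2n} * H_2 \bigr) + (2n-1)!!\, (\mathrm{Sp}^1_+)^n H_{2n}\, (\mathrm{Sp}^1_+ H_2) + A_{2n,0}(H_{2n}, V_{20} H_2)$$
$$= \frac{(2n)!}{(2n-2)!!}\, \mathrm{Sp}_+ \bigl( (\mathrm{Sp}^1_+)^{n-1} H_{2n} * H_2 \bigr).$$

Consequently

$$V_{2n,2} H_{2n} = -\frac{(2n)!}{(2n-2)!!\; 2}\, (\mathrm{Sp}^1_+)^{n-1} H_{2n}.$$

It can be verified by induction that

$$A_{2n,2k}(H_{2n}, H_{2k}) = \frac{(2n)!}{(2n-2k)!!}\, \mathrm{Sp}_+ \bigl( (\mathrm{Sp}^1_+)^{n-k} H_{2n} * H_{2k} \bigr),$$
$$\langle H_{2k}, H_{2k} \rangle_{2k} = (2k)!\, \mathrm{Sp}_+ H_{2k} * H_{2k}$$

and hence

$$V_{2n,2k} H_{2n} = -\frac{(2n)!}{(2n-2k)!!\, (2k)!}\, (\mathrm{Sp}^1_+)^{n-k} H_{2n}.$$

In the same manner

$$V_{2n+1,2k+1} H_{2n+1} = -\frac{(2n+1)!}{(2n-2k)!!\, (2k+1)!}\, (\mathrm{Sp}^1_+)^{n-k} H_{2n+1}.$$

Thus we finally obtain

$$\langle H_n, H_n \rangle_n = n!\, \mathrm{Sp}_+ H_n * H_n,$$
$$V_{nk} H_n = \begin{cases} -\dfrac{n!}{(n-k)!!\, k!}\, (\mathrm{Sp}^1_+)^{(n-k)/2} H_n, & n + k \text{ even}, \\ 0, & n + k \text{ odd}. \end{cases}$$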
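On the real line with $B = 1$ the subspaces reduce to multiples of the Hermite polynomials $He_n$, so the closed formulas of the Gaussian example can be tested with exact arithmetic. The sketch below is not the authors' construction: it assumes the scalar identification $P_n(h, x) = h\, He_n(x)$, and the names `hermite`, `inner` and `v_coefficient` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def double_factorial(m):
    """m!! with the conventions 0!! = (-1)!! = 1."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def hermite(n):
    """Coefficients (lowest degree first) of He_n, using
    He_{m+1}(x) = x He_m(x) - m He_{m-1}(x)."""
    prev, cur = [1], [0, 1]
    if n == 0:
        return prev
    for m in range(1, n):
        nxt = [0] + cur                  # x * He_m
        for i, c in enumerate(prev):
            nxt[i] -= m * c              # - m * He_{m-1}
        prev, cur = cur, nxt
    return cur


def gauss_moment(p):
    """E[x^p] for a standard normal variable."""
    return 0 if p % 2 else double_factorial(p - 1)


def inner(p, q):
    """Integral of p(x) q(x) with respect to the standard Gaussian measure."""
    return sum(a * b * gauss_moment(i + j)
               for i, a in enumerate(p) for j, b in enumerate(q))


def v_coefficient(n, k):
    """Scalar analogue of V_{nk}: minus the coefficient of He_k in x^n."""
    monomial = [0] * n + [1]
    he_k = hermite(k)
    return Fraction(-inner(monomial, he_k), inner(he_k, he_k))
```

Under these assumptions `inner(hermite(n), hermite(n))` equals $n!$, and `v_coefficient(n, k)` equals $-n!/((n-k)!!\,k!)$ when $n + k$ is even and vanishes otherwise.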

We now investigate the question of when the measurable polynomials are
dense in $\mathcal{L}_2[\mu]$. A sufficient condition is given by the following
lemma.

Lemma. If the characteristic functional $\varphi(z)$ of a measure $\mu$ is such
that for each $z$ the function $\varphi(tz)$ is an analytic function of $t$ in a
certain neighborhood of zero, then the set of measurable polynomials is
dense in $\mathcal{L}_2[\mu]$.

Proof. Denote the closure of the set of all measurable polynomials by
$\mathcal{P}$. We show that $e^{i(z,x)}$ regarded as a function of $x$ belongs to
$\mathcal{P}$. For this purpose it is sufficient to prove that for some sequence
of real-valued polynomials $q_n(t)$

$$\lim_{n \to \infty} \int \bigl| e^{i(z,x)} - q_n((z,x)) \bigr|^2\, \mu(dx) = 0. \tag{11}$$

Denote by $F(\lambda)$ the distribution function

$$F(\lambda) = \mu(\{x : (z, x) < \lambda\}).$$
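Relation (11) is equivalent to the following:

$$\lim_{n \to \infty} \int \bigl| e^{i\lambda} - q_n(\lambda) \bigr|^2\, dF(\lambda) = 0. \tag{12}$$

Since

$$\varphi(zt) = \int e^{it\lambda}\, dF(\lambda),$$

it follows from the analyticity of this function in the neighborhood of
zero that for some $\delta > 0$

$$\int e^{\delta|\lambda|}\, dF(\lambda) < \infty.$$

Let $\mathcal{L}$ be the space of complex-valued functions $g(\lambda)$ such that
$\int |g(\lambda)|^2\, dF(\lambda) < \infty$ with the scalar product

$$\int g_1(\lambda)\, \overline{g_2(\lambda)}\, dF(\lambda).$$

Let $\mathcal{P}'$ be the closure in $\mathcal{L}$ of the set of all polynomials and let
$g(\lambda)$ be the projection of the function $e^{i\lambda}$ on $\mathcal{P}'$. Then

$$\int \bigl( e^{i\lambda} - g(\lambda) \bigr) \lambda^n\, dF(\lambda) = 0$$

for all $n \ge 0$.

Utilizing the inequality

$$\Bigl| \sum_{k=0}^{n} \frac{(it\lambda)^k}{k!} \Bigr| \le e^{|t\lambda|}$$

for $|t| < \frac{\delta}{2}$ we obtain that

$$0 = \lim_{n \to \infty} \int \bigl( e^{i\lambda} - g(\lambda) \bigr) \sum_{k=0}^{n} \frac{(it\lambda)^k}{k!}\, dF(\lambda) = \int \bigl( e^{i\lambda} - g(\lambda) \bigr) e^{it\lambda}\, dF(\lambda).$$

Differentiating this relation with respect to $t$ we find that for
$|t| < \frac{\delta}{2}$

$$\int \bigl( e^{i\lambda} - g(\lambda) \bigr) \lambda^n e^{it\lambda}\, dF(\lambda) = 0.$$

Hence we have for $|t| < \frac{\delta}{2}$ and $|u| < \frac{\delta}{2}$

$$0 = \sum_{n=0}^{\infty} \frac{(iu)^n}{n!} \int \bigl( e^{i\lambda} - g(\lambda) \bigr) \lambda^n e^{it\lambda}\, dF(\lambda) = \int \bigl( e^{i\lambda} - g(\lambda) \bigr) e^{i(t+u)\lambda}\, dF(\lambda)$$

or

$$0 = \int \bigl( e^{i\lambda} - g(\lambda) \bigr) e^{it\lambda}\, dF(\lambda) \quad \text{for } |t| < \delta.$$

Continuing the previous arguments we observe that for all $t$

$$\int \bigl( e^{i\lambda} - g(\lambda) \bigr) e^{it\lambda}\, dF(\lambda) = 0.$$

Setting $t = -1$ we obtain

$$\int \bigl( e^{i\lambda} - g(\lambda) \bigr) e^{-i\lambda}\, dF(\lambda) = 0,$$

which together with the equality

$$\int \bigl( e^{i\lambda} - g(\lambda) \bigr) \overline{g(\lambda)}\, dF(\lambda) = 0$$

yields the relationship

$$\int \bigl| e^{i\lambda} - g(\lambda) \bigr|^2\, dF(\lambda) = 0,$$

which implies the limit (12). We now show that any continuous bounded
function belongs to the set $\mathcal{P}$. Let $f(x)$ be such a function. We choose
a compactum $K$ such that $\mu(\mathcal{X} - K) < \varepsilon$. The function $f(x)$ is
uniformly continuous in $K$, hence for each $\varepsilon > 0$ a $\delta > 0$ can be
found such that $|f(x) - f(y)| < \varepsilon$ for $|x - y| < \delta$.

Let $N$ be a finite-dimensional subspace which is a $\delta$-net in $K$,
denote by $P$ the projector on $N$ and let $K'$ be the projection of $K$ on
$N$. There exists a trigonometric polynomial $T$ on $N$ such that
$|f(Px) - T(x)| < \varepsilon$ for $x \in K'$ and which does not exceed
$\sup_x |f(x)|$ in its absolute value. It is easy to see that

$$\int |f(x) - T(Px)|^2\, \mu(dx) = O(\varepsilon).$$

Hence in addition to trigonometric polynomials all the bounded
continuous functions belong to $\mathcal{P}$ and hence also
$\mathcal{P} = \mathcal{L}_2[\mu]$ since the bounded continuous functions form a dense
set in $\mathcal{L}_2[\mu]$. The lemma is thus proved. □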

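We now present an example which shows that even on the real line
there exists a measure for which all the polynomials are square integrable
but are not dense in $\mathcal{L}_2[\mu]$. Let the measure $\mu$ on the real line be
defined by the density

$$f(x) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\, x} \exp\bigl\{ -\tfrac{1}{2} \log^2 x \bigr\}, & x > 0, \\ 0, & x \le 0. \end{cases}$$

Consider the following function $g(x)$ belonging to $\mathcal{L}_2[\mu]$:

$$g(x) = \begin{cases} \exp\{\varepsilon \log^2 x\}\, \sin\bigl( \pi (1 - 2\varepsilon) \log x \bigr), & x > 0, \\ 0, & x \le 0, \end{cases} \qquad \varepsilon < \tfrac{1}{4}.$$

For all integer-valued $k$ we have the equality (substitute $x = e^t$):

$$\int_{-\infty}^{\infty} x^k g(x) f(x)\, dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{kt}\, e^{-(\frac{1}{2} - \varepsilon) t^2} \sin\bigl( \pi (1 - 2\varepsilon) t \bigr)\, dt$$
$$= \frac{1}{2i\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\bigl\{ (k + i\pi(1 - 2\varepsilon))\, t - \tfrac{1}{2} (1 - 2\varepsilon) t^2 \bigr\}\, dt - \frac{1}{2i\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\bigl\{ (k - i\pi(1 - 2\varepsilon))\, t - \tfrac{1}{2} (1 - 2\varepsilon) t^2 \bigr\}\, dt$$
$$= \frac{1}{2i\sqrt{1 - 2\varepsilon}} \Bigl[ \exp\Bigl\{ \frac{(k + i\pi(1 - 2\varepsilon))^2}{2(1 - 2\varepsilon)} \Bigr\} - \exp\Bigl\{ \frac{(k - i\pi(1 - 2\varepsilon))^2}{2(1 - 2\varepsilon)} \Bigr\} \Bigr] = 0,$$

since the two exponentials differ only by the factors $e^{ik\pi}$ and
$e^{-ik\pi}$, which coincide for integer $k$.

Thus $g(x)$ is orthogonal to all the polynomials.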
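The vanishing of these integrals can also be observed numerically. The sketch below assumes the log-normal density and the function $g$ with the illustrative choice $\varepsilon = 0.1$, and uses the substitution $x = e^t$; the names `integrand` and `moment` are hypothetical, and the trapezoidal rule is just one convenient quadrature.

```python
import math

EPS = 0.1            # illustrative value; any 0 < EPS < 1/4 is assumed admissible
C = 1.0 - 2.0 * EPS


def integrand(k, t):
    """x^k g(x) f(x) dx after x = e^t, without the factor 1/sqrt(2*pi)."""
    return math.exp(k * t - 0.5 * C * t * t) * math.sin(math.pi * C * t)


def moment(k, half_width=40.0, steps=80_000):
    """Trapezoidal approximation of the integral of x^k g(x) f(x)."""
    h = 2.0 * half_width / steps
    total = 0.5 * (integrand(k, -half_width) + integrand(k, half_width))
    for j in range(1, steps):
        total += integrand(k, -half_width + j * h)
    return total * h / math.sqrt(2.0 * math.pi)
```

Calling `moment(k)` for small integers `k` returns values that are zero up to rounding error, while `g` itself is clearly not the zero function.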
§3. Measurable Mappings

Let $\mathcal{X}$ and $\mathcal{Y}$ be two Hilbert spaces with the σ-algebras of Borel
sets $\mathfrak{A}$ and $\mathfrak{B}$ respectively. The function $R(x)$ defined on an
$\mathfrak{A}$-measurable set $D_R$ and taking on values in $\mathcal{Y}$ is called a
measurable mapping of $\mathcal{X}$ into $\mathcal{Y}$ provided $R^{-1}(B) \in \mathfrak{A}$ for
all $B \in \mathfrak{B}$. Such a mapping is called $\mu$-measurable or measurable
with respect to measure $\mu$ provided $\mu(D_R) = 1$. Henceforth the notion of
measurable mapping is used in this section in the sense of a mapping
measurable with respect to the corresponding measure. If a mapping $R$ is
$\mu$-measurable then it maps measure $\mu$ into a measure $\nu$ on
$(\mathcal{Y}, \mathfrak{B})$ defined by the formula

$$\nu(B) = \mu(R^{-1}(B)).$$

Evaluation of integrals with respect to measure $\nu$ is reduced to their
evaluation with respect to measure $\mu$: for any $\mathfrak{B}$-measurable function
$f(y)$

$$\int f(y)\, \nu(dy) = \int f(R(x))\, \mu(dx),$$

provided only one of these integrals is well defined. An important
question in many applications is the problem of determining the
characteristics of the measure $\nu$ (i.e. the characteristic functional or
moment functions) from the known characteristics of the measure $\mu$.

The simplest example of a measurable mapping is a continuous mapping
of $\mathcal{X}$ into $\mathcal{Y}$. The theorem below shows the connection between
continuous and measurable mappings.

Theorem. For any $\mu$-measurable mapping $R$ one can find a sequence of
continuous mappings $R_n$ such that

$$R(x) = \lim_{n \to \infty} R_n(x) \pmod{\mu}.$$

Proof. It is sufficient to show that for any $\varepsilon > 0$ one can find a
continuous mapping $R_\varepsilon(x)$ of $\mathcal{X}$ into $\mathcal{Y}$ such that

$$\mu(\{x : |R(x) - R_\varepsilon(x)| > \varepsilon\}) < \varepsilon.$$

Denote by $\nu$ the measure on $(\mathcal{Y}, \mathfrak{B})$ which is the image of $\mu$ under
$R$. Let $K$ be a compactum in $\mathcal{Y}$ such that
$\nu(\mathcal{Y} - K) < \frac{\varepsilon}{2}$. Denote by $K'$ the preimage of $K$ under $R$.
Then $\mu(\mathcal{X} - K') < \frac{\varepsilon}{2}$. Let $N$ be a finite-dimensional linear
subspace in $\mathcal{Y}$ which is an $\frac{\varepsilon}{2}$-net in $K$, and let
$y_1, \ldots, y_m$ be a basis in $N$. Then for all $x \in K'$

$$\Bigl| R(x) - \sum_{k=1}^{m} (R(x), y_k)\, y_k \Bigr| < \frac{\varepsilon}{2}.$$

Since $(R(x), y_k)$ is a measurable function (with respect to $\mu$) a
continuous function $\varphi_k(x)$ exists such that

$$\mu\Bigl( \Bigl\{ x : |\varphi_k(x) - (R(x), y_k)| > \frac{\varepsilon}{2m} \Bigr\} \Bigr) < \frac{\varepsilon}{2m}.$$

Then

$$\mu\Bigl( \Bigl\{ x : \Bigl| R(x) - \sum_{k=1}^{m} \varphi_k(x)\, y_k \Bigr| > \varepsilon \Bigr\} \Bigr) \le \mu\Bigl( \Bigl\{ x : \Bigl| R(x) - \sum_{k=1}^{m} (R(x), y_k)\, y_k \Bigr| > \frac{\varepsilon}{2} \Bigr\} \Bigr) + \sum_{k=1}^{m} \mu\Bigl( \Bigl\{ x : |(R(x), y_k) - \varphi_k(x)| > \frac{\varepsilon}{2m} \Bigr\} \Bigr) < \varepsilon.$$

To complete the proof, we note that the mapping
$R_\varepsilon(x) = \sum_{k=1}^{m} \varphi_k(x)\, y_k$ is continuous. □

When studying measurable mappings of $\mathcal{X}$ into $\mathcal{Y}$ it is sufficient to
consider measurable mappings of $\mathcal{X}$ into $\mathcal{X}$ since every measurable
mapping of $\mathcal{X}$ into $\mathcal{Y}$ can be represented as a composition of a
measurable mapping of $\mathcal{X}$ into $\mathcal{X}$ and a continuous mapping of
$\mathcal{X}$ into $\mathcal{Y}$.

Polynomial mappings. Consider a mapping $R$ of $\mathcal{X}$ into $\mathcal{X}$. The
mapping $R$ is called polynomial if $(R(x), z)$ is a polynomial in $x$ for any
$z$. If for some $z$ the expression $(R(x), z)$ is a homogeneous polynomial
form of degree $n$, then we say that $R(x)$ is a homogeneous polynomial
mapping of degree $n$. When studying homogeneous polynomial mappings the
standard mappings, to be described below, play an important role.

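Denote by $\mathcal{X}^k$ the space of $k$-linear symmetric continuous forms $S$
satisfying the condition

$$\mathrm{Sp}\, S * S < \infty.$$

The space $\mathcal{X}^k$ with scalar product

$$(S, T) = \mathrm{Sp}\, S * T$$

is a separable Hilbert space. Denote by $\mathfrak{B}^k$ the σ-algebra of Borel sets
in $\mathcal{X}^k$. Consider the mapping $x \to T^k_x$ of $\mathcal{X}$ into $\mathcal{X}^k$ defined
by the relation

$$T^k_x(z_1, \ldots, z_k) = \prod_{j=1}^{k} (z_j, x).$$

This mapping is continuous since

$$|T^k_{x_1} - T^k_{x_2}|^2 = \sum_{i_1, \ldots, i_k} \Bigl[ \prod_{j=1}^{k} (x_1, e_{i_j}) - \prod_{j=1}^{k} (x_2, e_{i_j}) \Bigr]^2$$
$$= \sum_{i_1, \ldots, i_k} \Bigl[ \sum_{l=1}^{k} \prod_{j=1}^{l-1} (x_1, e_{i_j})\, (x_1 - x_2, e_{i_l}) \prod_{j=l+1}^{k} (x_2, e_{i_j}) \Bigr]^2 \le k \sum_{l=1}^{k} |x_1|^{2(l-1)}\, |x_1 - x_2|^2\, |x_2|^{2(k-l)}.$$

Consequently the mapping is measurable. Introduce on $\mathfrak{B}^k$ the measure
$\mu_k$ by means of the formula

$$\mu_k(A) = \mu(\{x : T^k_x \in A\}).$$

The measure $\mu_k$ is called the $k$-th power of the measure $\mu$. Note that
$\mu_k$ is the distribution of the random variable $T^k_x$ with values in
$\mathcal{X}^k$ in the probability space $(\mathcal{X}, \mathfrak{B}, \mu)$.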

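Denote by $\varphi_k(T)$ the characteristic functional of the measure $\mu_k$.
Since $\mathrm{Sp}\, T^k_x * S = S(x, \ldots, x)$ we have that

$$\varphi_k(S) = \int e^{i\, \mathrm{Sp}\, T * S}\, \mu_k(dT) = \int e^{i S(x, \ldots, x)}\, \mu(dx).$$

Thus $\varphi_k(S)$ is determined by the measure $\mu$ and hence by the
characteristic functional $\varphi(z)$ of measure $\mu$. We now describe some
methods for constructing $\varphi_k(S)$ by means of $\varphi(z)$. Let $S$ be a form
in $\mathcal{X}^k$ of the type $S(z_1, \ldots, z_k) = \prod_{j=1}^{k} (u_j, z_j)$. Then

$$\varphi_k(S) = \int \exp\Bigl\{ i \prod_{j=1}^{k} (x, u_j) \Bigr\}\, \mu(dx).$$

To evaluate the integral appearing in the r.h.s. we note that
$\varphi(t_1 u_1 + \ldots + t_k u_k)$ is the joint characteristic function of the
variables $(x, u_1), \ldots, (x, u_k)$ and

$$\int \exp\Bigl\{ i s \prod_{j=1}^{k} (x, u_j) \Bigr\}\, \mu(dx)$$

is the characteristic function of the variable $\prod_{j=1}^{k} (x, u_j)$ on the
probability space $(\mathcal{X}, \mathfrak{B}, \mu)$. If $\varphi(t_1 u_1 + \ldots + t_k u_k)$ is
absolutely integrable with respect to $t_1, \ldots, t_k$, then

$$\varphi_k(S) = \frac{1}{(2\pi)^k} \int \cdots \int \exp\Bigl\{ i \prod_{j=1}^{k} s_j - i \sum_{j=1}^{k} t_j s_j \Bigr\}\, \varphi(t_1 u_1 + \ldots + t_k u_k)\, dt_1 \ldots dt_k\, ds_1 \ldots ds_k. \tag{1}$$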







Computing the integral in (1) by means of a certain regularization
procedure (for example, by introducing the factor
$\exp\{-\alpha \sum_j t_j^2\}$ under the sign of the integral and then passing to
the limit as $\alpha \to 0$) we thus show that formula (1) is valid for
arbitrary characteristic functionals $\varphi$.

Consider the joint characteristic functional

$$\varphi_{k,1}(T, z) = \int \exp\{ i T(x, \ldots, x) + i (z, x) \}\, \mu(dx).$$

Assume that $\int |x|^k\, \mu(dx) < \infty$. Then for any form $S$ in $\mathcal{X}^k$ the
relation

$$\mathrm{Sp}\, d^k_z \varphi_{k,1}(T, z) * S = i^k \int S(x, \ldots, x) \exp\{ i T(x, \ldots, x) + i (z, x) \}\, \mu(dx)$$

is valid, where $d^k_z \varphi_{k,1}(T, z)$ is the $k$-th differential of the function
$\varphi_{k,1}(T, z)$ with respect to $z$ (such a differential is a $k$-linear
form). On the other hand

$$i \int S(x, \ldots, x) \exp\{ i T(x, \ldots, x) + i (z, x) \}\, \mu(dx) = \mathrm{Sp}\, d_T \varphi_{k,1}(T, z) * S,$$

where $d_T \varphi_{k,1}(T, z)$ is the first differential of the function
$\varphi_{k,1}(T, z)$ with respect to $T$.

Thus the function $\varphi_{k,1}$ satisfies the differential equation

$$i^{k-1}\, d_T \varphi_{k,1}(T, z) = d^k_z \varphi_{k,1}(T, z). \tag{2}$$

Note also that $\varphi_{k,1}(S, z)$ can be evaluated by means of formula (1)
if we substitute $\varphi(t_1 u_1 + \ldots + t_k u_k)$ by
$\varphi(z + t_1 u_1 + \ldots + t_k u_k)$ in the r.h.s. of (1).

Let $V$ be an arbitrary measurable linear mapping of $\mathcal{X}^k$ into $\mathcal{X}$.
The composition of the mappings

$$x \to T^k_x \to V T^k_x$$

will be called a measurable polynomial mapping of the $k$-th degree. Let
$R(x)$ be such a mapping. Then $(R(x), z)$ is a homogeneous polynomial of
degree $k$ for each $z$ or is the limit of such polynomials in measure $\mu$.

Indeed if $R(x) = V T^k_x$ where $V$ is a measurable linear mapping of
$\mathcal{X}^k$ into $\mathcal{X}$ then $(R(x), z) = V^* z (T^k_x)$, where $V^*$ is the
conjugate of $V$ (cf. Section 1) (this mapping maps measurable linear
functionals on $\mathcal{X}$ into measurable linear functionals on $\mathcal{X}^k$),
$V^* z$ is a functional on $\mathcal{X}^k$ which is the image of $(z, \cdot)$ on
$\mathcal{X}$, and $V^* z (S)$ is the value obtained when $V^* z$ is applied to $S$.

If $V^* z$ is a continuous functional, then $V^* z (T^k_x)$ is a homogeneous
polynomial of degree $k$. If, however, $V^* z$ is the limit in measure
$\mu_k$ of continuous linear functionals $g_n$ defined on $\mathcal{X}^k$ then
$g_n(T^k_x) \to V^* z (T^k_x)$ in measure $\mu$, and $g_n(T^k_x)$ is a
homogeneous polynomial of degree $k$. We now prove the converse. Assume
that $R(x)$ is a homogeneous polynomial mapping of $\mathcal{X}$ into $\mathcal{X}$.
Denote by $S_z$ the $k$-linear form such that

$$(R(x), z) = S_z(x, \ldots, x).$$

If $T^k_x$ is the element in $\mathcal{X}^k$ defined by the equality

$$T^k_x(z_1, \ldots, z_k) = (x, z_1) \ldots (x, z_k),$$

then $(R(x), z) = \mathrm{Sp}\, S_z * T^k_x$. Clearly $z \to S_z$ is a linear mapping
of $\mathcal{X}$ into $\mathcal{X}^k$; we denote it by $U$: $S_z = Uz$. Then

$$(R(x), z) = \mathrm{Sp}\, Uz * T^k_x = (z, U^* T^k_x),$$

where $U^*$ is the conjugate of $U$. Hence $R(x) = U^* T^k_x$, where $U^*$ is a
continuous linear mapping of $\mathcal{X}^k$ into $\mathcal{X}$.

Now let $R(x)$ be the limit in measure $\mu$ of continuous mappings $R_n(x)$
of degree $k$. Then $R_n(x) = V_n T^k_x$ and $R_n(x)$ converges $\mu$-almost
everywhere. Therefore a measurable linear mapping $V$ of the space
$\mathcal{X}^k$ into $\mathcal{X}$ exists to which $V_n T$ are convergent in measure
$\mu_k$ and such that $R(x) = V T^k_x$.

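Using the function $\varphi_k(T)$ one can easily find the characteristic
function of the measure $\nu$ to which $\mu$ is mapped under the measurable
polynomial mapping $R(x) = V T^k_x$ of degree $k$:

$$\varphi_\nu(z) = \varphi_k(V^* z).$$

Indeed,

$$\varphi_\nu(z) = \int e^{i (z, R(x))}\, \mu(dx) = \int e^{i (z, V T^k_x)}\, \mu(dx) = \int e^{i V^* z (T)}\, \mu_k(dT) = \varphi_k(V^* z). \tag{3}$$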



Expansion of measurable mappings in terms of orthogonal systems of
polynomials. Let the measure $\mu$ be such that the set of all polynomials is
dense in $\mathcal{L}_2[\mu]$ and let $R(x)$ be a measurable mapping of $\mathcal{X}$ into
$\mathcal{X}$ satisfying the condition

$$\int |R(x)|^2\, \mu(dx) < \infty.$$

Then for each $z \in \mathcal{X}$ the expression $(R(x), z)$ can be expanded with
respect to the orthogonal subspaces constructed in Section 2:
$(R(x), z) = \sum_{k=0}^{\infty} P_k(z, x)$, where $P_k(z, \cdot) \in \mathcal{P}_k$. Clearly,
$P_k(z, x)$ depends linearly on $z$. We choose an orthonormal basis $\{e_i\}$
in $\mathcal{X}$. Then

$$\sum_{i} \int |P_k(e_i, x)|^2\, \mu(dx) \le \sum_{i} \int (R(x), e_i)^2\, \mu(dx) = \int |R(x)|^2\, \mu(dx).$$

Hence the series

$$\sum_{i=1}^{\infty} P_k(e_i, x)\, e_i = R_k(x)$$

is convergent in measure $\mu$ and

$$P_k(z, x) = \sum_{i=1}^{\infty} P_k(e_i, x)\, (z, e_i) = (R_k(x), z).$$

Thus under the assumptions above a measurable mapping $R(x)$ can
be represented in the form

$$R(x) = \sum_{k=1}^{\infty} R_k(x),$$

where each one of the mappings $R_k$ is a measurable polynomial mapping
of degree $k$ and the series is mean-square convergent. For $k \ne i$ the
mappings $R_k$ and $R_i$ are orthogonal in the following sense: for each
bounded operator $B$ the equality

$$\int (B R_k(x), R_i(x))\, \mu(dx) = 0$$

is satisfied. Indeed,

$$\int (B R_k(x), R_i(x))\, \mu(dx) = \sum_{j} \int (B R_k(x), e_j)\, (e_j, R_i(x))\, \mu(dx) = \sum_{j} \int (R_k(x), B^* e_j)\, (R_i(x), e_j)\, \mu(dx) = 0,$$

since $(R_k(x), z)$ and $(R_i(x), u)$ are orthogonal for $k \ne i$. Therefore

$$\int |R(x)|^2\, \mu(dx) = \sum_{k=1}^{\infty} \int |R_k(x)|^2\, \mu(dx)$$

and

$$\int (R(x), z)^2\, \mu(dx) = \sum_{k=1}^{\infty} \int (R_k(x), z)^2\, \mu(dx). \tag{4}$$

Formula (4) enables us to calculate the correlation operator of the
measure $\nu$ which is the image of $\mu$ under the mapping $R(x)$.
§4. Calculation of Certain Characteristics of Transformed Measures

In this section, results are presented which enable us to determine
characteristic functionals and some other characteristics of measures
obtained from the given measure by means of measurable mappings. Some
results of this nature were presented previously in this volume. For
example, in Section 3 of Chapter VII, a formula for the density of the
transformed measure in terms of the original one was obtained; in
Section 1 of the present chapter, a formula for the characteristic
functional of a measure obtained from the given one by means of a
measurable linear transformation was presented; and in Section 3, a similar
formula was given for the characteristic functional of a measure obtained
from the given one by means of a measurable homogeneous polynomial
transformation. Obviously the results presented below are not sufficient
for solving all the problems occurring in the theory of measurable
transformations of probability measures. However, for a great number of
important problems in applications, computational algorithms for
solutions may be constructed on the basis of the results presented below.

Groups of transformations. Let a group of transformations $R_t$ depending
on a parameter $t$ be defined in a Hilbert space $\mathcal{X}$ by the differential
equation

$$\frac{dR_t(x)}{dt} = A(R_t(x)), \tag{1}$$

where $A(x)$ is a continuous mapping of $\mathcal{X}$ into $\mathcal{X}$ which assures the
existence and uniqueness of the solution of equation (1) satisfying the
initial condition $R_0(x) = x \in \mathcal{X}$. Then $R_t(x)$ will also be a
continuous mapping of $\mathcal{X}$ into $\mathcal{X}$ for all $t$. Let $\mu$ be a
probability measure on $(\mathcal{X}, \mathfrak{B})$ and let $\nu_t$ be the measure obtained
from $\mu$ under the mapping $R_t(x)$. Denote by $\varphi_t(z)$ and $\varphi(z)$ the
characteristic functionals of the measures $\nu_t$ and $\mu$. Then

$$\varphi_t(z) = \int e^{i (z, R_t(x))}\, \mu(dx). \tag{2}$$

Assume that $\int |A(R_t(x))|\, \mu(dx) < \infty$. Then one can differentiate the
integral appearing in the r.h.s. of (2) with respect to $t$ and obtain

$$\frac{\partial}{\partial t} \varphi_t(z) = i \int (z, A(R_t(x)))\, e^{i (z, R_t(x))}\, \mu(dx) = i \int (z, A(x))\, e^{i (z, x)}\, \nu_t(dx).$$
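Under certain additional assumptions on $A(x)$ the r.h.s. of the last
equality can be expressed in terms of $\varphi_t(z)$. Let

$$A(x) = A_1(x) + A_2(x),$$

where $A_1(x)$ is a polynomial mapping and $A_2(x)$ admits the following
representation

$$A_2(x) = \int e^{i (u, x)}\, Q(du), \tag{3}$$

where $Q(du)$ is a countably-additive set function of bounded variation
defined on $(\mathcal{X}, \mathfrak{B})$ and taking on values in $\mathcal{X}$. In this case if

$$(A_1(x), z) = \sum_{k=0}^{n} H^k_z(x, \ldots, x),$$

where $H^k_z$ are continuous $k$-linear forms, then

$$i \int e^{i (z, x)}\, (A_1(x), z)\, \nu_t(dx) = \sum_{k=0}^{n} i^{1-k}\, \mathrm{Sp}\, d^k \varphi_t(z) * H^k_z.$$

On the other hand,

$$i \int (z, A_2(x))\, e^{i (z, x)}\, \nu_t(dx) = i \int \int e^{i (u + z, x)}\, \nu_t(dx)\, (z, Q(du)) = i \int \varphi_t(u + z)\, (z, Q(du)).$$

Thus, under the assumptions imposed, $\varphi_t(z)$ satisfies the following
integro-differential equation

$$\frac{\partial \varphi_t(z)}{\partial t} = \sum_{k=0}^{n} i^{1-k}\, \mathrm{Sp}\, d^k \varphi_t(z) * H^k_z + i \int \varphi_t(u + z)\, (z, Q(du)). \tag{4}$$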
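A one-dimensional special case gives a quick sanity check of the evolution equation for $\varphi_t$. The sketch below assumes the linear field $A(x) = -x$ with no Fourier part ($Q = 0$) and a standard normal initial measure, so that $R_t(x) = e^{-t}x$ and the equation is assumed to reduce to $\partial\varphi_t/\partial t = -z\, \partial\varphi_t/\partial z$; the function names are hypothetical and derivatives are approximated by central differences.

```python
import math


def phi(t, z):
    """Characteristic function of e^{-t} X for a standard normal X."""
    return math.exp(-0.5 * z * z * math.exp(-2.0 * t))


def time_derivative(t, z, h=1e-5):
    """Central-difference approximation of d(phi_t)/dt."""
    return (phi(t + h, z) - phi(t - h, z)) / (2.0 * h)


def transport_term(t, z, h=1e-5):
    """Right-hand side -z * d(phi_t)/dz for the field A(x) = -x."""
    return -z * (phi(t, z + h) - phi(t, z - h)) / (2.0 * h)
```

For any fixed `t` and `z` the two functions agree up to the discretization error of the finite differences.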

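Transformations closely related to the linear. The representation of
mappings by means of Fourier transforms of countably-additive functions
of bounded variation with values in $\mathcal{X}$ may be utilized for the
determination of the characteristic functional of the transformed measure
provided the mapping is close to a linear one. Let
$A(x) = Vx + \varepsilon A_2(x)$, where $V$ is a continuous linear operator,
$\varepsilon$ a sufficiently small number and let $A_2$ admit representation (3).
Then $A_2$ is bounded and hence

$$\varphi_\nu(z) = \int e^{i (z, Vx + \varepsilon A_2(x))}\, \mu(dx) = \sum_{k=0}^{\infty} \frac{(i\varepsilon)^k}{k!} \int e^{i (V^* z, x)}\, (z, A_2(x))^k\, \mu(dx).$$

Utilizing representation (3) we find

$$(z, A_2(x))^k = \int \cdots \int e^{i (u_1 + \ldots + u_k, x)}\, (z, Q(du_1)) \ldots (z, Q(du_k)).$$

Thus,

$$\int e^{i (V^* z, x)}\, (z, A_2(x))^k\, \mu(dx) = \int \cdots \int \varphi(V^* z + u_1 + \ldots + u_k)\, (z, Q(du_1)) \ldots (z, Q(du_k)).$$

Finally we obtain

$$\varphi_\nu(z) = \sum_{k=0}^{\infty} \frac{(i\varepsilon)^k}{k!} \underbrace{\int \cdots \int}_{k \text{ times}} \varphi(V^* z + u_1 + \ldots + u_k)\, (z, Q(du_1)) \ldots (z, Q(du_k)). \tag{5}$$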
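A scalar example makes the structure of this series concrete. The sketch below is an illustration under stated assumptions, not part of the original derivation: it takes a standard normal $\mu$ on the real line, the bounded perturbation $\cos x$ (the Fourier transform of the signed measure $\frac12(\delta_1 + \delta_{-1})$), and compares a partial sum of the series with a direct quadrature; `series` and `direct` are hypothetical names.

```python
import cmath
import math


def phi(w):
    """Characteristic function of the standard normal law."""
    return math.exp(-0.5 * w * w)


def series(z, v, eps, terms=40):
    """Partial sum of the expansion for the map x -> v*x + eps*cos(x)."""
    total = 0j
    for k in range(terms):
        # k-fold integral against Q = (delta_1 + delta_{-1}) / 2
        inner = sum(math.comb(k, m) * phi(v * z + (2 * m - k))
                    for m in range(k + 1)) / 2 ** k
        total += (1j * eps * z) ** k / math.factorial(k) * inner
    return total


def direct(z, v, eps, half_width=12.0, steps=48_000):
    """Trapezoidal value of E exp{i z (v X + eps cos X)}, X standard normal."""
    h = 2.0 * half_width / steps
    total = 0j
    for j in range(steps + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += (weight * cmath.exp(1j * z * (v * x + eps * math.cos(x)))
                  * math.exp(-0.5 * x * x))
    return total * h / math.sqrt(2.0 * math.pi)
```

With moderate values such as `z = 1.5`, `v = 1.0`, `eps = 0.3` the two evaluations agree to many digits, which reflects the convergence of the series for all values of the small parameter.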



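Note that under our assumptions this series is convergent for all $\varepsilon$
and is an entire analytic function in $\varepsilon$. Formula (5) can be
substantially simplified if we assume that $(z, Q(du))$ is a non-negative
measure and put $i\varepsilon = \lambda$ (provided $\lambda > 0$). Let $\pi^z_\lambda(du)$ be a
measure in $\mathcal{X}$ which is infinitely divisible with the characteristic
functional

$$\int e^{i (z^*, u)}\, \pi^z_\lambda(du) = \exp\Bigl\{ \lambda \int \bigl( e^{i (z^*, u)} - 1 \bigr)\, (z, Q(du)) \Bigr\}.$$

Then

$$\sum_{k=0}^{\infty} \frac{\lambda^k}{k!} \underbrace{\int \cdots \int}_{k \text{ times}} \varphi(V^* z + u_1 + \ldots + u_k)\, (z, Q(du_1)) \ldots (z, Q(du_k)) = e^{\lambda (z, Q(\mathcal{X}))} \int \varphi(V^* z + x)\, \pi^z_\lambda(dx).$$

Clearly the measures $Q$ could have been defined not on $\mathcal{X}$ but on a
certain extension of it to which the function $\varphi(z)$ is extendable by
continuity.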
The duality formula and other expansions in powers of a small parameter.
When evaluating various integrals with respect to measures in Hilbert
spaces (in particular, measures associated with square integrable
processes), it is sometimes convenient to use a very simple formula which
replaces integration with respect to one measure by integration with
respect to another. Let $\xi$ and $\eta$ be two independent random variables
with values in $\mathcal{X}$ and let $\mu$ and $\nu$ be the distributions of these
variables and $\varphi_\xi$ and $\varphi_\eta$ their characteristic functionals.
Evaluating the integral

$$\mathsf{E}\, e^{i (\xi, \eta)} = \int \int e^{i (x, y)}\, \mu(dx)\, \nu(dy)$$

in two possible ways (first with respect to $\mu$ and then with respect to
$\nu$ and conversely) we obtain

$$\int \varphi_\eta(x)\, \mu(dx) = \int \varphi_\xi(y)\, \nu(dy). \tag{6}$$

This formula will be called the duality formula. A particular case of this
formula is:

$$\int e^{-\frac{1}{2} (Bx, x)}\, \mu(dx) = \int \varphi_\xi(y)\, \nu(dy),$$

where $\nu$ is a Gaussian measure with the correlation operator $B$. Formula
(6) is especially convenient to use when the measure $\nu$ is an infinitely
divisible distribution. Let $h(x)$ be a functional of the form
$\frac{1}{i} \log \varphi_\eta(x)$, where $\nu$ is an infinitely divisible
distribution. We introduce the family of random variables $\eta_t$
possessing the characteristic functional
$\varphi_{\eta_t}(z) = \mathsf{E}\, e^{i (z, \eta_t)} = e^{i t h(z)}$. Then
$\mathsf{E}\, e^{i t h(\xi)}$, where $\xi$ is a random variable with values in
$\mathcal{X}$ distributed according to $\mu$, can be evaluated by means of
formula (6):

$$\mathsf{E}\, e^{i t h(\xi)} = \mathsf{E}\, \varphi_\xi(\eta_t). \tag{7}$$

This formula is valid only for $t > 0$; in the case of negative $t$ (or in the
case of complex-valued $t$ appearing in the Laplace transform) one can
utilize the analytic continuation in $t$ of the expression obtained.
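The duality formula is easy to test in one dimension. The sketch below is illustrative and rests on assumed choices: $\xi = \pm 1$ with probability $\frac12$, so that $\varphi_\xi(y) = \cos y$, and $\eta$ Gaussian with variance `b`; the function names are hypothetical.

```python
import math


def phi_xi(y):
    """Characteristic function of xi = +1 or -1 with probability 1/2 each."""
    return math.cos(y)


def phi_eta(x, b):
    """Characteristic function of a centered Gaussian eta with variance b."""
    return math.exp(-0.5 * b * x * x)


def left_side(b):
    """Integral of phi_eta with respect to the law of xi."""
    return 0.5 * phi_eta(1.0, b) + 0.5 * phi_eta(-1.0, b)


def right_side(b, half_width=12.0, steps=24_000):
    """Integral of phi_xi with respect to the law of eta (trapezoidal rule)."""
    sd = math.sqrt(b)
    h = 2.0 * half_width / steps            # grid in units of standard deviations
    total = 0.0
    for j in range(steps + 1):
        u = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * phi_xi(sd * u) * math.exp(-0.5 * u * u)
    return total * h / math.sqrt(2.0 * math.pi)
```

Both sides evaluate to $e^{-b/2}$, which is the particular Gaussian case of the formula in this scalar setting.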

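The duality formula can be used to obtain an expansion of the
characteristic functions of functionals of a random variable in powers of a
small parameter $\varepsilon$ if the expansion in powers of $\varepsilon$ of the
characteristic functional of the variable is known. Let

$$\mathsf{E}\, e^{i (z, \xi_\varepsilon)} = \varphi_\varepsilon(z) = \sum_{k=0}^{\infty} \varepsilon^k \chi_k(z) \tag{8}$$

be given and let it be required to evaluate the characteristic function of
the variable $h(\xi_\varepsilon)$ where $h$ satisfies formula (7). Then
$\mathsf{E}\, e^{i t h(\xi_\varepsilon)} = \sum_{k=0}^{\infty} \varepsilon^k\, \mathsf{E}\, \chi_k(\eta_t)$.

As an example of the application of this formula consider the case when
$h(x) = (Bx, x)$ where $B$ is a positive-definite operator. We have the
equality $i s (Bx, x) = \log \mathsf{E}\, e^{\sqrt{2is}\, (x, \eta)}$ where $\eta$ is a
Gaussian variable with the correlation operator $B$ (note that this
variable can be defined in a certain extension of the space $\mathcal{X}$).
Therefore

$$\mathsf{E}\, e^{i s (B \xi_\varepsilon, \xi_\varepsilon)} = \mathsf{E}\, e^{\sqrt{2is}\, (\xi_\varepsilon, \eta)}.$$

The last formula is valid provided $\varphi_\varepsilon(tz)$ is an entire analytic
function in $t$. Substituting $s = it$ in this formula, we obtain the
Laplace transform of the variable $(B \xi_\varepsilon, \xi_\varepsilon)$:

$$\mathsf{E}\, e^{-t (B \xi_\varepsilon, \xi_\varepsilon)} = \mathsf{E}\, \varphi_\varepsilon(\sqrt{2t}\, \eta) = \sum_{k=0}^{\infty} \varepsilon^k\, \mathsf{E}\, \chi_k(\sqrt{2t}\, \eta).$$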
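Let the family of random variables $\xi_\varepsilon$ depending on a positive
parameter $\varepsilon$ be such that $\xi_\varepsilon \to 0$ in probability as
$\varepsilon \to 0$. Next assume that for the characteristic functional
$\varphi_\varepsilon(z)$ of the variable $\zeta_\varepsilon = \frac{1}{\varepsilon} \xi_\varepsilon$,
expansion (8) is valid. Furthermore, let the functional $h(x)$ admit a
representation of the form $h(x) = \sum_{k=1}^{\infty} P_k(x)$ where $P_k(x)$ is a
homogeneous polynomial of degree $k$ and $H_k$ is its generating $k$-linear
form. We find the expansion in powers of $\varepsilon$ of the characteristic
function of the variable $\frac{1}{\varepsilon} h(\xi_\varepsilon)$ under the assumption that
expansion (8) is termwise infinitely differentiable. We have

$$\mathsf{E} \exp\Bigl\{ \frac{is}{\varepsilon} h(\xi_\varepsilon) \Bigr\} = \mathsf{E} \exp\Bigl\{ \frac{is}{\varepsilon} \sum_{k=1}^{\infty} P_k(\varepsilon \zeta_\varepsilon) \Bigr\} = \mathsf{E} \exp\Bigl\{ is \sum_{k=1}^{\infty} \varepsilon^{k-1} P_k(\zeta_\varepsilon) \Bigr\}$$
$$= \mathsf{E} \exp\{ is P_1(\zeta_\varepsilon) \} \sum_{n=0}^{\infty} \frac{1}{n!} \Bigl[ is \sum_{k=2}^{\infty} \varepsilon^{k-1} P_k(\zeta_\varepsilon) \Bigr]^n$$
$$= \mathsf{E} \exp\{ is P_1(\zeta_\varepsilon) \} \sum_{n=0}^{\infty} \varepsilon^n \sum_{k=0}^{2n} Q_{nk}(\zeta_\varepsilon)\, r_{nk}(is).$$

Here $Q_{nk}(x)$ are homogeneous polynomials of degree $k$, and $r_{nk}(t)$ are
numerical (real-valued) polynomials of degree at most $k/2$. These
polynomials are uniquely determined by means of the relations

$$\sum_{k=0}^{2n} Q_{nk}(x)\, r_{nk}(t) = \frac{1}{n!} \frac{\partial^n}{\partial \varepsilon^n} \exp\Bigl\{ t \sum_{k=2}^{\infty} \varepsilon^{k-1} P_k(x) \Bigr\} \Big|_{\varepsilon = 0}.$$

Assume that $P_1(x) = (a, x)$ where $a \ne 0$ and
$Q_{nk}(x) = T_{nk}(x, \ldots, x)$, where $T_{nk}$ is a $k$-linear form. Then

$$\mathsf{E} \exp\{ is P_1(\zeta_\varepsilon) \}\, Q_{nk}(\zeta_\varepsilon) = \mathsf{E}\, e^{is (a, \zeta_\varepsilon)}\, T_{nk}(\zeta_\varepsilon, \ldots, \zeta_\varepsilon) = i^{-k}\, \mathrm{Sp}\, d^k \varphi_\varepsilon(sa) * T_{nk} = i^{-k} \sum_{m} \varepsilon^m\, \mathrm{Sp}\, d^k \chi_m(sa) * T_{nk}.$$

Consequently,

$$\mathsf{E} \exp\Bigl\{ \frac{is}{\varepsilon} h(\xi_\varepsilon) \Bigr\} = \sum_{n=0}^{\infty} \varepsilon^n \sum_{k=0}^{2n} r_{nk}(is)\, i^{-k} \sum_{m=0}^{\infty} \varepsilon^m\, \mathrm{Sp}\, d^k \chi_m(sa) * T_{nk}. \tag{9}$$

Expansion (9) should be rewritten collecting the coefficients with the
common powers of $\varepsilon$. In the same manner, one can also obtain a finite
expansion with a bound on the remainder if in place of a series in (8)
we take a finite expansion with a remainder and represent $h(x)$ in the
form

$$h(x) = \sum_{k=1}^{N} P_k(x) + O(T_{N+1}(x)),$$

where $T_{N+1}$ is a homogeneous polynomial of the $(N+1)$-th degree.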

A formula analogous to (9) can be used to determine the characteristic
functional of the transformed measure provided the transformation
$R(x)$ admits the representation

$$R(x) = \sum_{k=1}^{\infty} R_k(x), \tag{10}$$

where $R_k(x)$ are homogeneous polynomial transformations of the $k$-th
degree. Let $V_k$ be a linear mapping of $\mathcal{X}$ into $\mathcal{X}^k$ which maps $z$
into the form $V_k(z)$ which generates the polynomial $(R_k(x), z)$. Then

$$\int e^{i (z, R(x))}\, \mu(dx) = \int e^{i (z, R_1(x))} \sum_{n=0}^{\infty} \frac{i^n}{n!} \Bigl( \sum_{k=2}^{\infty} V_k(z)(x, \ldots, x) \Bigr)^n \mu(dx).$$



If we denote by $\varphi_\nu$ the characteristic function of the measure $\nu$, then

$$\varphi_\nu(z) = \sum_{k=0}^{\infty} \mathrm{Sp}\, d^k \varphi(V_1 z) * T^z_k, \tag{11}$$

where $T^z_k$ is a $k$-linear form given by



r/=Z 



•ni+...+nj 



— ^ M ! VI f 

j m + 2/12 + --. +jwj = fc 1. . ./Ij . 



V°'"{z)...Vp{z)*. 



(12) 



* stands for F... V. 



k times 
Expansion (11) is meaningful if the measure $\mu$ satisfies the following
condition: for all $m$ and $t$

$$\int \exp\Bigl\{ t \sum_{k=1}^{m} |R_k(x)| \Bigr\}\, \mu(dx) < \infty.$$



An application of orthogonal polynomials. Let $\mu$ be a measure for which
the orthogonal polynomials $P_k(H_k, x)$ are constructed (cf. Section 2).
We shall investigate how this fact assists us in finding the
characteristic functional $\varphi_\nu$ of the measure $\nu$ obtained from $\mu$ by
the transformation $R(x)$. Assume that we have an expansion of the function
$e^{i (z, R(x))}$ into a series of orthogonal polynomials:
$e^{i (z, R(x))} = \sum_{k=0}^{\infty} P_k(H^z_k, x)$. Then
$\varphi_\nu(z) = \int e^{i (z, R(x))}\, \mu(dx) = P_0(H^z_0)$. Hence the problem of
expanding $e^{i (z, R(x))}$ is not simpler than determining $\varphi_\nu(z)$. It is
more natural to use orthogonal polynomials in the case when $\nu$ is
absolutely continuous with respect to $\mu$ and when the density
$\frac{d\nu}{d\mu}(x) = q(x)$ belongs to $\mathcal{L}_2[\mu]$. Assume that the expansion of
$q(x)$ in terms of orthogonal polynomials is known:
$q(x) = \sum_{k=0}^{\infty} P_k(H_k, x)$. Then

$$\varphi_\nu(z) = \int e^{i (z, x)}\, q(x)\, \mu(dx) = \int e^{i (z, x)} \sum_{k=0}^{\infty} P_k(H_k, x)\, \mu(dx) = \sum_{k=0}^{\infty} \int e^{i (z, x)}\, P_k(H_k, x)\, \mu(dx).$$

Clearly, $\int e^{i (z, x)}\, Q(x)\, \mu(dx)$, where $Q$ is a polynomial, can be easily
expressed in terms of $\varphi(z)$ by applying to it a differential operator.
Therefore it can be assumed that the functions

$$\chi_k(H_k, z) = \int e^{i (z, x)}\, P_k(H_k, x)\, \mu(dx)$$

are determined. Then $\varphi_\nu(z)$ is expressed in terms of these functions by
the formula

$$\varphi_\nu(z) = \sum_{k=0}^{\infty} \chi_k(H_k, z). \tag{13}$$




Expressions for the density of the transformed measure in terms of
the original one are presented in Section 3 of Chapter VII. This density
is expressed in terms of the function $\varrho(a, x)$ which is the density of
the shifted measure relative to the initial measure $\mu$.

We now present a method for obtaining an expansion of $\varrho(a, x)$ into
a series of orthogonal polynomials. Assume that

$$\varrho(a, x) = \sum_{k=0}^{\infty} P_k(H^a_k, x), \tag{14}$$

where $H^a_k$ is a $k$-linear form dependent on $a$. Then for every polynomial
$P_k(H_k, x)$ in $\mathcal{P}_k$ (cf. Section 2) the following relationship

$$\int P_k(H_k, x)\, P_k(H^a_k, x)\, \mu(dx) = \int P_k(H_k, x)\, \varrho(a, x)\, \mu(dx) = \int P_k(H_k, x + a)\, \mu(dx) = S^a_k(H_k)$$

is satisfied. The linear functional $S^a_k$ on $\Phi^k$ can be easily evaluated
if we expand the polynomial $P_k(H_k, x + a)$ by means of the Taylor theorem
and use the expansion of each one of the polynomials obtained into a
series of orthogonal polynomials. Obviously, the linear function
$S^a_k(H_k)$ on $\Phi^k$ can be represented by means of the scalar product

$$S^a_k(H_k) = \langle H_k, H^a_k \rangle_k$$

(cf. Section 2). This relation uniquely determines $H^a_k$ and thereafter the
function $\varrho(a, x)$ is determined by formula (14). In order that
$\varrho(a, x)$ exist and belong to $\mathcal{L}_2[\mu]$ it is necessary and sufficient
that series (14) be convergent, i.e. that the inequality

$$\sum_{k=0}^{\infty} \langle H^a_k, H^a_k \rangle_k < \infty$$

be satisfied.
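For the standard Gaussian measure on the real line the expansion of the shift density can be written out explicitly, which gives a simple check of the convergence criterion. The sketch below assumes the scalar identification $P_k(h, x) = h\, He_k(x)$ with $\langle h, h' \rangle_k = k!\, h h'$, under which the coefficients are assumed to be $a^k / k!$; the function names are hypothetical.

```python
import math


def hermite_values(x, n_max):
    """He_0(x), ..., He_{n_max}(x) via He_{m+1} = x He_m - m He_{m-1}."""
    values = [1.0, x]
    for m in range(1, n_max):
        values.append(x * values[m] - m * values[m - 1])
    return values[: n_max + 1]


def density_series(a, x, n_max=60):
    """Partial sum of sum_k (a^k / k!) He_k(x), the expansion of the shift density."""
    he = hermite_values(x, n_max)
    return sum(a ** k / math.factorial(k) * he[k] for k in range(n_max + 1))


def norm_series(a, n_max=60):
    """Partial sum of sum_k k! (a^k / k!)^2, the squared norm of the density."""
    return sum(a ** (2 * k) / math.factorial(k) for k in range(n_max + 1))
```

In this setting `density_series(a, x)` approximates $\exp\{ax - a^2/2\}$ and `norm_series(a)` approximates $e^{a^2}$, so the series of squared norms converges for every shift $a$.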




Historical and Bibliographical Remarks 



The remarks presented below contain a number of references to the literature dealing with 
the problems discussed in the book. They are not intended to present a complete bibliog- 
raphy or to sketch the history of the basic ideas in the theory of random processes. In many 
cases we do not refer to the original obscure publications but rather to more recent text- 
books and monographs which contain a bibliography on the topics under consideration. 



Chapter I 

§1. The exposition is based on the by now commonly accepted set-theoretical
axiomatization of probability theory as suggested by A. N. Kolmogorov in 1929
and presented in his monograph [60] and [58]. In connection with the
measure-theoretic results and the theory of integration (used in this volume)
see the texts by A. N. Kolmogorov and S. V. Fomin [70], P. Halmos [42],
I. I. Gihman and A. V. Skorohod [33], J. Neveu [80] and P. A. Meyer [77].

§2. The general 0-1 law was established by A. N. Kolmogorov [60].

§3. The theory of conditional probabilities and conditional mathematical expectations 
was introduced by A. N. Kolmogorov [60]. It was further developed by J. L. Doob [20]. 
See also M. Loeve [74] and J. Neveu [80]. 

§4. The basic theorem is due to A. N. Kolmogorov. 



Chapter II 

§2. Martingales were discussed by various authors but the systematic theory of this notion 
is due to J. L. Doob [20]. He first derived the basic inequalities for martingales, proved 
the theorem on the existence of the limit, introduced the notion of a semimartingale and 
also obtained other results. More information on martingales can be found in the books 
by J. L. Doob, M. Loeve and P. A. Meyer quoted above. 

§3. The basic ideas and results presented in this section are due to A. N. Kolmogorov 
and A. Ya. Khinchin [54] and to A. N. Kolmogorov [57]. Series of independent random 
variables are discussed in more detail in the books by J. L. Doob [20], M. Loeve [74] and 
A. V. Skorohod [101]. 

§4. Markov chains with a finite number of states were introduced (in 1906) and studied
by A. A. Markov [76]. The general definition of Markov chains and processes is due to 
A. N. Kolmogorov [64]. More general approaches are developed in E. B. Dynkin’s 
monographs [23] and [24]. 

§5. Markov chains with a countable number of states were first studied in the works 
of A. N. Kolmogorov [62, 63], W. Doeblin [17] and later were investigated by numerous 
authors. See W. Feller [28], K. L. Chung [12], E. B. Dynkin and A. A. Yushkevich [25],
J. S. Kemeny, J. L. Snell and A. W. Knapp [52].

§6. Random walks were studied by various authors and many results are known in 
this field. See W. Feller, E. B. Dynkin and A. A. Yushkevich [28], A. V. Skorohod and 
N. P. Slobodenyuk [103] and F. Spitzer. 

§7. B. V. Gnedenko [36] was the first to study the local limit theorems for lattice one- 
dimensional distributions. See B. V. Gnedenko and A. N. Kolmogorov, I. A. Ibragimov 
and Yu. V. Linnik [38], A. V. Skorohod and N. P. Slobodenyuk [103]. 

§8. Ergodic theorems originated in relation to problems in statistical mechanics. See 
A. Ya. Khinchin’s book [55] in this connection. The first ergodic theorems due to J. von 
Neumann and G. Birkhoff served as the beginning of an intensive development of the 
theory. A survey of the first period of the development of ergodic theory is contained in 
E. Hopf’s monograph [44]. A simple proof of the Birkhoff-Khinchin theorem was given
by A. N. Kolmogorov [65]. Further developments in ergodic theory are discussed in books 
by P. Halmos [43], K. Jacobs [48] and P. Billingsley [6]. 



Chapter III 

§1. A multi-dimensional generalization of the central limit theorem was first considered
by S. N. Bernstein [5]. B. de Finetti [29] originated the systematic study of processes with
independent increments. The characteristic function of a process with independent in- 
crements in the case of finiteness of the second-order moment was obtained by A. N. 
Kolmogorov [59] and in the general case by P. Levy [73] (both are univariate). See also 
remarks to sec. 4 in Chapter II in connection with the general definitions of Markov pro- 
cesses. 

§§2 and 3. The feasibility of constructing a random process stochastically equivalent 
to the given one with sample functions satisfying certain regularity conditions was first 
investigated by E. E. Slutzky and A. N. Kolmogorov (cf. E. E. Slutzky’s paper [105]). 
Many substantial results are due to J. L. Doob in connection with further developments 
and various versions of the axiomatic definition of random functions. References to earlier 
papers are found in Doob’s monograph [20]. The basic theorems in sections 2 and 3 are
due to J. L. Doob. See also E. E. Slutzky [105]. 

§4. Theorem 1 in a somewhat weaker form was proved by N. N. Chencov [11]; theorem 
2 by J. H. Kinney [56] (in the case of Markov processes). P. Levy [73] established the 
absence of discontinuities of the second kind in the case of stochastically continuous 
processes with independent increments. J. L. Doob [20] studied the properties of sample 
functions of martingales. 

§5. Theorem 2 was proved by E. B. Dynkin [22] and independently by J. H. Kinney [56]
(for Markov processes). A somewhat weaker version of theorem 6 is due to A. N. Kolmogorov
and was first published in E. E. Slutzky’s work [105]. Yu. K. Belyaev [3, 4]
studied local properties of Gaussian processes. See also the book by H. Cramer and M. R. 
Leadbetter [15]. 



Chapter IV 

§§1 and 2. A. Ya. Khinchin [53] introduced the notion of a wide-sense stationary process.
In the same paper the spectral representation of the correlation function of a wide-sense 
stationary process was presented. F. Riesz and G. Herglotz obtained in 1911 the spectral 
representation of a positive definite sequence and S. Bochner [8] obtained in 1932 the 
representation for positive definite functions. I. J. Schoenberg’s work [95] contains the spectral
representation of a homogeneous and isotropic random field in Euclidean and Hilbert
spaces. 

§3. E. E. Slutzky [104] and M. Loeve [74]. 

§4. The theory of stochastic integrals was introduced by H. Cramer [13]. A. N. Kol- 
mogorov was the first to clarify the connection between stochastic integrals, spectral repre- 
sentations and methods in the theory of Hilbert spaces [68]. See also J. L. Doob [20]. 

§5. Theorem 1 is due to K. Karhunen [50] and Theorem 2 is due to H. Cramer [13]. 

§6. Using the theory of filters the spectral representation of a stationary process can 
be easily obtained (A. Blanc-Lapierre and R. Fortet [7]). A more general theory of linear 
transformations of random processes may be constructed by means of the theory of
generalized random processes introduced by I. M. Gelfand and K. Ito (I. M. Gelfand and N. Ya.
Vilenkin [30], K. Ito [47]). 

§7. Basic results for the case of stationary sequences are due to A. N. Kolmogorov [68] 
and for continuous parameter processes are due to K. Karhunen [51] (see J. L. Doob [20] 
and Yu. A. Rozanov [91]).

§8. The general formulation of the problems of linear forecasting (for stationary
sequences), its connection with the geometry of Hilbert spaces and its reduction to a problem
in the theory of functions are due to A. N. Kolmogorov. N. Wiener developed efficient
methods for solving problems of linear forecasting and filtering for continuous-parameter
processes. A. M. Yaglom’s method with a large number of examples is presented in his 
monograph [116]. 

§9. The theorem on decomposition of a stationary process and the notions of deter- 
minate and undeterminate processes are due to H. Wold. The general solution of the 
problem of forecasting a stationary sequence from its past was obtained by A. N. Kolmo- 
gorov and for continuous parameter processes by M. G. Krein [71, 72]. The problem of 
forecasting a vector-valued stationary sequence was discussed by Yu. A. Rozanov [89],
N. Wiener and P. Masani [115]. Details on forecasting continuous parameter processes
are given in the books by J. L. Doob [20] and Yu. A. Rozanov [91].



Chapter V 

The construction of a measure in a functional space was first carried out by N. Wiener 
[112]. The general method of constructing such measures is due to A. N. Kolmogorov [29]. 
Measures in Banach and complete metric spaces were studied in works of A. N.
Kolmogorov, E. Mourier [79], Yu. V. Prohorov [85] and K. R. Parthasarathy [81].

§3. The construction of an extension of the initial spaces on which there exists a 
measure with a given positive definite function for its characteristic functional is due to 
L. Gross [40]; the theorem in Section 3 is due to E. Mourier [79]. 

The theorem in Section 5 is due to V. V. Sazonov [94] and R. A. Minlos [78]; generalized
measures on a Hilbert space were introduced by Yu. L. Daletskii [16].

[...] Vershik [110]; he also studied linear and quadratic functionals measurable with respect to
these measures. Multiple stochastic integrals were constructed by K. Ito [46]. Yu. A.
Rozanov [93] obtained the general form of linear and quadratic functions on stationary
Gaussian processes. 



Chapter VI 

§1. The proof of the sufficiency of conditions of Theorem 1 is due to Yu. V. Prohorov [85].

§2. The condition of weak compactness of measures on a Hilbert space was established 
by K. R. Parthasarathy [81]. 

§3. The general form of an infinitely divisible distribution and the condition for
convergence of a distribution of a sum of independent random variables with values in a Hilbert
space to such a distribution were obtained by S. R. S. Varadhan [108]. Conditions for
convergence to a Gaussian distribution were studied by N. P. Kandelaki and V. V. Sazonov
[49]. A sufficiently detailed exposition of these results is given in K. R. Parthasarathy’s 
book [81]. 

§4. M. Donsker [18] initiated the study of general limit theorems for random processes;
his result is stated in Theorem 4. Theorems 1-3 are due to Yu. V. Prohorov.

§5. The first limit theorem for processes without discontinuity of the second kind is due 
to I. I. Gihman [31]. The space D[0, 1] and the limit theorems for processes on this
space were studied by A. V. Skorohod [98]. The convergence in D[0, 1] was investigated by
A. N. Kolmogorov [69] and Yu. V. Prohorov [85]. An interesting limit theorem was
obtained by N. N. Chencov [11]; theorem 3 is a minor modification of this theorem. The
convergence of processes with independent increments and Markov processes was studied 
by A. V. Skorohod [98, 99, 100]. An application of limit theorems to statistical problems 
was considered by M. Donsker and I. I. Gihman [19]. 



Chapter VII 

Various problems of absolute continuity of measures in functional spaces are discussed in
I. I. Gihman and A. V. Skorohod’s paper [34].

§2. Measures with an everywhere dense set of admissible shifts were considered by 
V. N. Sudakov [107]. The structure of the set of admissible shifts was studied by T. S. 
Pitcher. Theorem 4 is stated in A. M. Vershik’s paper. 

§3. Certain results on absolute continuity of Gaussian measures in Hilbert spaces 
under non-linear transformations are given in the paper by V. V. Baklan and
A. D. Shatashvili [2].

§4. The condition for absolute continuity and the formula for the density of a Gaussian 
measure under a shift were obtained by U. Grenander [39]. General conditions for absolute 
continuity and singularity for Gaussian measures are found in J. Hajek’s, J. Feldman’s
and Yu. A. Rozanov’s papers [41, 26, 92].

§5. Theorems 1 and 2 are due to Yu. A. Rozanov. Basic results in this field are presented
in Yu. A. Rozanov’s book [93].

§6. Absolutely continuous mappings of certain classes of Markov processes were 
considered by I. V. Girsanov [35] and A. V. Skorohod [100]. General theorems on absolute 
continuity of measures associated with processes with independent increments and with
Markov processes are presented in I. I. Gihman and A. V. Skorohod’s paper [34].



Chapter VIII 

§1. Measurable linear operators and linear functionals are studied in the book by G. E.
Shilov and Fan Dyk Tan [96]. 

§2. The orthogonal system of polynomials for the Wiener measure was constructed in 
the paper by K. Ito [46]; various applications of these polynomials are given in N. Wiener’s
book. Orthogonal polynomials for the Gaussian measure were constructed by A. M. 
Vershik [110]. 




Bibliography 



1. Akhiezer, N. I., Glazman, I. M.: Theory of Linear Operators in Hilbert Space.
N.Y.: Frederick Ungar Publishing Co., 1966 [English translation].

2. Baklan, V. V., Shatashvili, A. D. : Transforms of Gaussian Measures under non-linear 
transformations in Hilbert spaces. Dopovidi, A. N. URSR 9, 1115-1117 (1965) [in 
Ukrainian]. 

3. Belyaev, Yu. K.: Local properties of sample function of stationary Gaussian pro- 
cesses. Theor. Probability Appl. 5, 128-131 (1960). 

4. Belyaev, Yu. K.: Continuity and Hölder’s conditions for sample functions of stationary
Gaussian processes. Proc. Fourth Berk. Symp. on Math. Stat. and Probability
2, 23-33 (1961). 

5. Bernstein, S.: Sur l’extension du théorème limite du calcul des probabilités aux
sommes de quantités dépendantes. Math. Ann. 97, 1-59 (1926). [Russian translation:
Uspehi Mat. Nauk 10, 65-114 (1944)].

6. Billingsley, P. : Ergodic Theory and Information. N.Y. : J. Wiley 1965. 

7. Blanc-Lapierre, A., Fortet, R. : Theorie des fonctions aleatoires. Paris : Masson et Cie. 
1953. 

8. Bochner, S.: Lectures on Fourier Integrals, Princeton, 1959. 

9. Cameron, R. H., Martin, W. T.: Transformations of Wiener integrals under a general
class of linear transformations. Trans. Amer. Math. Soc. 58, 184-219 (1945).

10. Cameron, R. H., Martin, W. T. : Transformation of Wiener integrals by nonlinear 
transformation. Trans. Amer. Math. Soc. 66, 253-283 (1949). 

11. Chencov, N. N.: Weak convergence of stochastic processes whose trajectories have
no discontinuities of the second kind and the heuristic approach to the Kolmogorov- 
Smirnov tests. Theor. Probability Appl. 1, 140-149 (1956). 

12. Chung, K. L.: Markov chains with stationary transition probabilities. Berlin - Göttingen
- Heidelberg: Springer 1960.

13. Cramer, H. : On the theory of stationary random processes. Ann. Math. 41, 215-230 
(1940). 

14. Cramer, H. : On stochastic processes whose trajectories have no discontinuities of the 
second kind. Ann. di Matematica (iv) 71, 85-92 (1966). 

15. Cramer, H., Leadbetter, M. R. : Stationary and related stochastic processes, N.Y. : 
J. Wiley 1967. 

16. Daletskii, Yu. L. : Infinite-dimensional elliptic operators and parabolic equations con- 
nected with them. Russian Math. Surveys 22, No. 4, 3-53 (1967). 

17. Doeblin, W.: Sur les propriétés asymptotiques de mouvements régis par certains types
de chaînes simples. Bull. Math. Soc. Sci. Math. R. S. Roumanie 39, No. 1, 57-115;
No. 2, 3-61 (1937).

18. Donsker, M.: An invariance principle for certain probability limit theorems. Mem. 
Amer. Math. Soc. 6, 1-12 (1951). 







19. Donsker, M.: Kolmogorov-Smirnov theorems. Ann. Math. Statist. 23, 277-281 
(1952). 

20. Doob, J. L. : Stochastic processes. N.Y. : J. Wiley 1953. 

21. Dunford, N., Schwartz, J. T. : Linear operators I, II. New York: Interscience Pub- 
lishers 1958, 1963. 

22. Dynkin, E. B. : Criterion for continuity and absence of discontinuities of the second 
kind for trajectories of a Markov random process. Izv. Akad. Nauk Armjan. SSR 
Ser. Mat. 16 (1952).

23. Dynkin, E. B.: Foundations of the theory of Markov processes. Fizmatgiz. (1959).

24. Dynkin, E. B.: Markov processes. Vols. I and II. N.Y. : Academic Press 1965. 

25. Dynkin, E. B., Yushkevich, A. A.: Theorems and Problems in Markov Processes. 
Moscow: Nauka 1967. 

26. Feldman, J. : Equivalence and perpendicularity of Gaussian processes. Pacific J. Math. 
8, 699-708 (1958). 

27. Feldman, J.: Some classes of equivalent Gaussian processes on an interval. Pacific J.
Math. 10, 1211-1220 (1960).

28. Feller, W.: An introduction to probability theory and its applications. N.Y.: J. Wiley,
Vol. I (1957), Vol. II (1966).

29. de Finetti, B.: Sulle funzioni a incremento aleatorio. Rend. Acad. Naz. Lincei, Cl. Sci.
Fis. Mat. Nat. (6), 10, 163-168 (1929).

30. Gelfand, I. M., Vilenkin, N. Ya. : Applications of harmonic analysis; Saturated 
Hilbert spaces. Fizmatgiz. (1961). 

31. Gihman, I. I.: On a theorem of Kolmogorov. Nauch. Zap., Kiev, Un-ta, Mat. Sborn.
7, 76-94 (1953).

32. Gihman, I. I.: Markov processes in problems of mathematical statistics. Ukrainian
Math. J. 6, 28-36 (1954).

33. Gihman, I. I., Skorohod, A. V.: Introduction to the theory of random processes. 
Fizmatgiz. (1965). 

34. Gihman, I. I., Skorohod, A. V.: On densities of probability measures in function
spaces. Russian Math. Surveys (Uspehi Mat. Nauk), XXI, 6, 83-156 (1966).

35. Girsanov, I. V. : On transforming a class of random processes by absolutely continu- 
ous substitutions of measures. Theory prob. and its Applic. 5, 314-334 (1960). 

36. Gnedenko, B. V.: On a local theorem for the limit stable distributions. Ukrainian
Math. J. 1, 3-15 (1949).

37. Gnedenko, B. V. : The Theory of Probability. English translation of the fourth edition. 
New York: Chelsea 1967. 

38. Gnedenko, B. V., Kolmogorov, A. N. : Limit Distributions for sums of Independent 
Random Variables, Reading, Mass.: Addison Wesley 1954. 

39. Grenander, U.: Stochastic processes and statistical inference. Ark. Mat. 1, 195-277
(1950). 

40. Gross, L. : Harmonic analysis on Hilbert space. Mem. Amer. Math. Soc. 46, 1-62 
(1963). 

41. Hajek, J.: On a property of normal distribution of a stochastic process. Czechoslovak.
Math. J. 8, 610-618 (1958) [Russian-English summary]. 

42. Halmos, P. R. : Measure Theory. Princeton, N.J. : D. van Nostrand 1950. 

43. Halmos, P. R. : Lectures on Ergodic Theory. J. Math. Soc. Japan, No. 3 (1956). 

44. Hopf, E.: Ergodentheorie. Ergebnisse der Math., Vol. 2. Berlin: J. Springer 1937
(Reprinted by Chelsea Publishing Co., N.Y. 1948). 

45. Ibragimov, I. A., Linnik, Yu. V.: Independent and stationary associated variables. 
Moscow: Nauka 1965 [English translation]. 

46. Ito, K. : Multiple Wiener Integral. J. Math. Soc. Japan, 3, 157-169 (1951). 

47. Ito, K. : Stationary random distributions. Mem. Coll. Sci. Univ. Kyoto 28, 206-223 
(1954). 







48. Jacobs, K.: Neuere Methoden und Ergebnisse der Ergodentheorie. Berlin - Göttingen
- Heidelberg: Springer 1960.

49. Kandelaki, N. P., Sazonov, V. V. : On the central limit theorem for random elements 
with values in Hilbert space. Theory of Prob. and Applic. 9, No. 1, 38-46 (1964). 

50. Karhunen, K.: Über lineare Methoden in der Wahrscheinlichkeitsrechnung. Ann. Acad.
Sci. Fennicae, Ser. A. Math. Phys. 37, 3-79 (1947).

51. Karhunen, K.: Über die Struktur stationärer zufälliger Funktionen. Ark. Mat. 1,
141-160 (1950).

52. Kemeny, J. G., Snell, J. L., Knapp, A. W. : Denumerable Markov chains. N.Y.-L. : 
Van Nostrand 1966. 

53. Khinchin, A.: Correlation theory of stationary random processes. Usp. Mat. Nauk, 
5, 42-51 (1938). 

54. Khinchin, A., Kolmogorov, A. N.: Über Konvergenz von Reihen, deren Glieder durch
den Zufall bestimmt werden. Matem. Sb. 32, 668-677 (1925).

55. Khinchin, A.: Mathematical Foundations of Statistical Mechanics. N.Y. : Dover 
Publications 1949. 

56. Kinney, J. H. : Continuity properties of sample functions of Markov processes. Trans. 
Amer. Math. Soc. 74, 280-302 (1953). 

57. Kolmogorov, A. N.: Über die Summen durch den Zufall bestimmter unabhängiger
Größen. Math. Ann. 99, 309-319 (1928); 100, 484-488 (1929).

58. Kolmogorov, A. N. : General Measure theory and calculus of probabilities. Trudy 
Komm. Akad., Math. Division 1, 8-21 (1929). 

59. Kolmogorov, A. N. : Sulla forma generale di un processo stocastico omogeneo, Atti. 
Accad. Lincei 15, 805-808, 866-869 (1932). 

60. Kolmogorov, A. N. : Foundations of the theory of probability. N.Y. : Chelsea Press. 
[The German original appeared in 1933; Russian version 1936]. 

61. Kolmogorov, A. N.: La transformation de Laplace dans les espaces linéaires. Compt. Rend.
Acad. Sci. (Paris) 200, 1717 (1935).

62. Kolmogorov, A. N.: Anfangsgründe der Theorie der Markoffschen Ketten mit
unendlich vielen möglichen Zuständen, 1, 607-610 (1936).

63. Kolmogorov, A. N.: Markov chains with a countable number of possible states. Bull.
Math. Univ. Moscow 1, 1-16 (1937) [in Russian].

64. Kolmogorov, A. N.: Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung.
Math. Ann. 104 (1931) [Russian transl.: Usp. Mat. Nauk 5, 5-41 (1938)].

65. Kolmogorov, A. N.: Simplified proof of the Birkhoff-Khinchin ergodic theorem.
Uspekhi Math. Nauk 5, 52-56 (1938) [in Russian]. 

66. Kolmogorov, A. N. : Curves in Hilbert spaces invariant relative to one-parametric 
group of motions. Dokl. Akad. Nauk 26, 6-9 (1940). 

67. Kolmogorov, A. N. : Wiener’s spiral and some other interesting curves in Hilbert 
spaces. Dokl. Akad. Nauk 26, 115-118 (1940). 

68. Kolmogorov, A. N. : Stationary sequences in Hilbert space. Bull. Math. Univ. Moscow 
2, No. 6, 1-40 (1941) [in Russian]. 

69. Kolmogorov, A. N.: On Skorohod’s convergence. Theor. Probability Appl. 1,
239-247 (1956).

70. Kolmogorov, A. N., Fomin, S. V. : Elements of the theory of functions and functional 
analysis. Sec. ed. Moscow: Nauka 1968. 

71. Krein, M. G. : On an Extrapolation Problem of A. N. Kolmogorov. Dokl. Akad. 
Nauk SSSR 46, 306-309 (1944). 

72. Krein, M. G. : On the Basic Approximation Problem in the Theory of Extrapolation 
and Filtering of Stationary Random Processes. Dokl. Akad. Nauk SSSR 94, 13-16. 

73. Levy, P.: Sur les integrales dont les elements sont des variables aleatoires indepen- 
dentes, Ann. Scuola Norm. Sup. Pisa 2, No. 3, 337-366 (1934). 

74. Loeve, M.: Probability Theory, 2nd Ed. Princeton, N.J. : D. van Nostrand, 1960. 







75. Lyusternik, L. A., Sobolev, V. I.: Elements of Functional Analysis. N.Y.: Ungar 1961.

76. Markov, A. A. : Extension of the law of large numbers to dependent events. Bull. Soc. 
Phys. Math. Kazan (2) 15, 155-156 (1906) [in Russian]. 

77. Meyer, P. A. : Probability and Potentials. 1966. 

78. Minlos, R. A.: Generalized stochastic processes and their extension with respect to 
measure. Trudy Moscow Math. Soc. 8, 497-518 (1959). 

79. Mourier, E.: Éléments aléatoires dans un espace de Banach. Ann. Inst. H. Poincaré
13 (1953).

80. Neveu, J.: Mathematical Foundations of the Calculus of Probability. San Francisco:
Holden-Day 1965.

81. Parthasarathy, K. R.: Probability Measures on Metric Space. N.Y.-L.: Academic 
Press 1967. 

82. Pinsker, M. S. : Information and information stability of random variables and pro- 
cesses. San Francisco: Holden Day 1964 [English transl.]. 

83. Pitcher, T. S.: The admissible mean values of stochastic process. Trans. Amer. 
Math. Soc. 108, 538-546 (1963).

84. Privalov, I. I.: Boundary properties of Analytic Functions. Moscow-Leningrad 
1950 [German translation: VEB Deutscher Verlag der Wissenschaften, Berlin, 1956]. 

85. Prohorov, Yu. V.: Convergence of random processes and limit theorems in probability
theory. Theory of Prob. and its applic. 1, 187-214 (1956).

86. Prohorov, Yu. V.: The method of characteristic functionals. Proc. 4th Berkeley
Symp. 2, 403-419 (1961).

87. Prohorov, Yu. V., Sazonov, V. V.: Some results associated with Bochner’s theorem.
Theor. Probability Appl. 6, 82-87 (1961).

88. Prohorov, Yu. V., Fish, M.: A characterization of normal distributions in Hilbert
space. Theor. Probability Appl. 2, 468-470 (1957).

89. Rozanov, Yu. A. : Spectral theory of multi-dimensional stationary random processes 
with discrete time. Uspehi Mat. Nauk 13, 2 (80), 93-142 (1958). 

90. Rozanov, Yu. A.: On the density of one Gaussian measure in relation to another. 
Theor. Probability Appl. 7, 84-89 (1962). 

91. Rozanov, Yu. A.: Stationary Random Processes. San Francisco: Holden Day 1967.

92. Rozanov, Yu. A.: On the density of Gaussian distributions and Wiener-Hopf 
integral equations. Dokl. Akad. Nauk SSSR 165, 1000-1002 (1965), [Soviet Math. 
Dokl. 6, 1551-1553 (1965)]. 

93. Rozanov, Yu. A.: Gaussian infinitely dimensional distributions. Steklov Math.
Institute Publ. 108, 1-136 (1968).

94. Sazonov, V. V. : A remark on characteristic functionals. Theor. Probability Appl. 3, 
188-192 (1958). 

95. Schoenberg, I. J.: Metric spaces and completely monotone functions. Ann. Math. 39,
811-841 (1938). 

96. Shilov, G. E., Fan Dyk Tan: Integral, measure and derivatives on linear spaces. 
Moscow: Nauka 1967. 

97. Skorohod, A. V.: Limit theorems for stochastic processes. Theor. Probability Appl. 1,
261-290 (1956). 

98. Skorohod, A. V.: Limit theorems for stochastic processes with independent incre- 
ments. Theor. Probability Appl. 2, 138-171 (1957). 

99. Skorohod, A. V.: Limit theorems for Markov processes. Theor. Probability Appl. 3, 
202-246 (1958). 

100. Skorohod, A. V.: Studies in the theory of random processes. Reading, Mass.:
Addison Wesley 1965.

101. Skorohod, A. V.: Random processes with independent increments. Fizmatgiz, 1964. 

102. Skorohod, A. V.: On the densities of probability measures in functional space. Proc.
5th Berkeley Symp. 2, 163-182 (1965).







103. Skorohod, A. V., Slobodenyuk, N. P. : Limit theorems for random walks, UkrSSSR; 
Naukova Dumka 1970. 

104. Slutzky, E. E. : Sur les fonctions eventuelles continues, integrables et derivables dans 
le sens stochastique, Comptes Rendus Acad. sci. 187, 370-372 (1928). 

105. Slutzky, E. E.: Some statements concerning the theory of random processes. Publ.
of Middle-Asian University. Mat. Series (5), 31, 3-15 (1949).

106. Spitzer, F. : Principles of random walk. Princeton, N. J.: D. van Nostrand 1964. 

107. Sudakov, V. N.: Linear spaces with quasi-invariant measure. Dokl. Akad. Nauk 127,
524-525 (1959).

108. Varadhan, S. R. S.: Limit theorems for sums of independent random variables with
values in a Hilbert space. Sankhya 24, 213-238 (1962).

109. Vershik, A. M.: On the theory of normal dynamic systems. Dokl. Akad. Nauk 144,
9-12 (1962).

110. Vershik, A. M.: General theory of Gaussian measures in linear spaces. Uspekhi
Math. Nauk 19, 210-212 (1964).

111. Vershik, A. M.: Duality in the theory of measure in linear spaces. Dokl. Akad. Nauk
170, 497-500 (1966).

112. Wiener, N.: Differential space. J. Math. Phys. Mass. Inst. Tech. 2, 131-174 (1923).

113. Wiener, N.: Extrapolation, interpolation and smoothing of stationary time series.
N.Y. 1949.

114. Wiener, N.: Nonlinear problems in random theory. M.I.T. and J. Wiley 1958.

115. Wiener, N., Masani, P.: Prediction theory of multivariate stochastic processes. Acta
Math. 98, 111-150 (1957); 99, 93-137 (1958).

116. Yaglom, A. M.: An Introduction to the theory of Stationary Random Functions.
Englewood Cliffs, N.J.: Prentice-Hall 1962.




Corrections 



Pp. 340-341. The proof of the boundedness of m(z) presented on pp. 340-341 is 
erroneous. Below we present a corrected proof. 

Since $m_n(z) \uparrow m(z)$ and $m_n(z)$ is continuous, $m(z)$ is continuous from below.
Moreover, by virtue of Minkowski’s inequality,

$$[m(z_1 + z_2)]^{1/2} = \Big[\int |(x, z_1 + z_2)|^2 \,\mu(dx)\Big]^{1/2} \le \Big[\int |(x, z_1)|^2 \,\mu(dx)\Big]^{1/2} + \Big[\int |(x, z_2)|^2 \,\mu(dx)\Big]^{1/2}.$$

Thus $[m(z)]^{1/2}$ is a semiadditive function continuous from below and hence in
view of the theorem by I. M. Gelfand (see, e.g., L. V. Kantorovich and G. P.
Akilov, Functional Analysis in Normed Spaces, Moscow, Fizmatgiz, 1959, p. 233)
there exists a constant $M$ such that $[m(z)]^{1/2} \le M|z|$.



P. 408. Omit the last line on this page. 

Pp. 525-531. In Section 1 of Chapter VIII two theorems on measurable linear 
functions and their operators are presented. As stated therein the theorems are 
not correct. We now present the correct statements and proofs. 

A function $l(x)$ is called a measurable linear functional with respect to a measure
$\mu$ on a Hilbert space $(\mathcal{X}, \mathfrak{B})$ if $l(x)$ is the limit in measure $\mu$ of a sequence of
continuous linear functionals $l_n(x)$.
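
For reference, the notion of a limit in measure used in this definition can be written out explicitly. The display below is the standard formulation of convergence in measure for a finite measure, stated in the notation of this section; the symbol $\varepsilon$ is introduced here only for illustration and does not come from the text.

```latex
l(x) = \mu\text{-}\lim_{n\to\infty} l_n(x)
\quad\Longleftrightarrow\quad
\lim_{n\to\infty} \mu\{x : |l_n(x) - l(x)| > \varepsilon\} = 0
\quad \text{for every } \varepsilon > 0 .
```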



Theorem 1. In order that a $\mathfrak{B}$-measurable function $l(x)$ be a measurable linear
functional with respect to measure $\mu$ it is necessary and sufficient that a symmetric
convex compact set $K$ exist such that the following conditions be satisfied:

1) if $\mathcal{L}$ is the linear hull of $K$, then $\mu(\mathcal{L}) = 1$;

2) $l(x)$ is linear on $\mathcal{L}$;

3) $l(x)$ is continuous on $K$.

Proof. Necessity. If $l(x)$ is a measurable linear functional then a sequence of
continuous functionals $l_n(x)$ exists such that

$$l(x) = \mu\text{-}\lim_{n\to\infty} l_n(x)$$

and

$|l_n(x) - l_{n+1}(x)|$ [...]

Set [...]. Clearly, $\mathcal{S}_k$ is a symmetric convex closed set and

[...]

Since $l_n(x)$ is uniformly convergent to $l(x)$ on each one of the sets $\mathcal{S}_k$ it is
convergent to $l(x)$ on the linear hull of the set $\mathcal{S}_k$ and hence $l(x)$ is linear on $\mathcal{S}_k$.
Let $F_k$ be a symmetric convex compact such that $F_k \subset \mathcal{S}_k$, $\mu(\mathcal{S}_k - F_k) < 1/k$, and
$\mathcal{L}_k$ be the linear hull of $F_k$. Choose a sequence $\rho_k \downarrow 0$ such that

$$\sum_k \rho_k \Big( \sup_{x \in F_k} |x| + \sup_n \sup_{x \in F_k} |l_n(x)| \Big) < \infty$$

and let

$$K = \Big\{x: x = \sum_k \rho_k x_k,\ x_k \in F_k \Big\}.$$

It is easy to see that: a) $K$ is a symmetric convex compact set; b) the linear hull $\mathcal{L}$ of
the set $K$ contains all the sets $\mathcal{L}_k$ and hence $\mu(\mathcal{L}) = 1$; c) $l_n(x)$ converges to $l(x)$
uniformly on $K$ and therefore $l(x)$ is linear on $\mathcal{L}$. The necessity of the theorem’s
conditions is thus verified.

Sufficiency. Denote $K_n = \{x: (1/n)x \in K\}$. Then $\mathcal{L} = \bigcup_n K_n$ and hence for any
$\varepsilon > 0$ there exists $n$ such that $\mu(K_n) > 1 - \varepsilon$. Clearly $K_n$ is a symmetric convex
compact set. We show that for any $\delta > 0$ a continuous linear functional $\varphi(x)$ exists
such that $|\varphi(x) - l(x)| < \delta$ for $x \in K_n$. Set

$$S_1 = \{x: l(x) \ge \delta/2\} \cap K_n, \qquad S_2 = \{x: l(x) \le -\delta/2\} \cap K_n.$$

In view of conditions 2) and 3), $S_1$ and $S_2$ are convex compacts symmetrical with
respect to the origin 0. Therefore there exists a hyperplane which passes through
the origin and separates these sets. Let this hyperplane be represented by
$\{x: (a, x) = 0\} = L$. Denote by $\varphi(x)$ the functional

$$\varphi(x) = l(x_0)\,\frac{(a, x)}{(a, x_0)},$$

where $x_0 \in S_1$ is a point such that $(a, x_0)$ is maximal. $\varphi(x)$ is a continuous
functional. Furthermore,

$$|l(x) - \varphi(x)| = \Big| l(x) - l\Big(x_0\,\frac{(a, x)}{(a, x_0)}\Big) \Big| = 2\,\Big| l\Big(\frac{1}{2}\Big(x - x_0\,\frac{(a, x)}{(a, x_0)}\Big)\Big) \Big|.$$

However,

$$\frac{1}{2}\Big(x - x_0\,\frac{(a, x)}{(a, x_0)}\Big) \in K_n \cap L \subset K_n \setminus S_1 \setminus S_2,$$

since $|(a, x)/(a, x_0)| \le 1$. Hence

$$\Big| l\Big(\frac{1}{2}\Big(x - x_0\,\frac{(a, x)}{(a, x_0)}\Big)\Big) \Big| < \frac{\delta}{2}, \qquad |l(x) - \varphi(x)| < \delta. \quad \square$$



Remark. If $l(x)$ is a $\mathfrak{B}$-measurable linear functional, then there exists an
orthonormal basis $\{e_k\}$ in $\mathcal{X}$ such that $l(x)$ is the limit in measure $\mu$ of the sequence
$l(P_n x)$. Indeed, let $l(x) = \lim_{n\to\infty} (a_n, x)$ in measure $\mu$. There exists an everywhere
positive function $\rho(x)$ such that

$$\lim_{n\to\infty} \int [(a_n, x) - l(x)]^2 \rho(x)\,\mu(dx) = 0, \qquad \int |x|^2 \rho(x)\,\mu(dx) < \infty.$$

Let $A$ be a symmetric operator satisfying

$$(Az, z) = \int (z, x)^2 \rho(x)\,\mu(dx).$$

As follows from the lemma in Volume I, Chapter V, Section 5, $A$ is a symmetric
kernel operator. Denote its eigenvectors by $\{e_k\}$ and the corresponding eigenvalues
by $\lambda_k$. Then

$$\int [(a_n, x) - (a_m, x)]^2 \rho(x)\,\mu(dx) = (A(a_n - a_m), a_n - a_m) = \sum_{k=1}^{\infty} \lambda_k [(a_n, e_k) - (a_m, e_k)]^2.$$

Therefore the limits $\lim_{n\to\infty} (a_n, e_k) = \alpha_k$ exist such that
$l(x) = \sum_{k=1}^{\infty} \alpha_k (x, e_k)$, where the series is convergent in measure $\mu$ (cf. Volume I, Chapter
V, Section 6).

A measurable function $A(x)$ defined on $\mathcal{X}$ with values in $\mathcal{X}$ is called a
measurable linear operator if a sequence of linear operators $A_n$ exists such that
$A_n x$ is weakly convergent to $A(x)$ in measure $\mu$, i.e., for all $y \in \mathcal{X}$ the numerical
sequence $(A_n x, y)$ converges in measure $\mu$ to $(A(x), y)$.

Theorem 2. In order that $A(x)$ be a measurable linear operator it is necessary and
sufficient that there exist a symmetric compact set $K$ such that the following
conditions are satisfied:

1) if $\mathcal{L}$ is the linear hull of $K$, then $\mu(\mathcal{L}) = 1$;

2) $A(x)$ is linear on $\mathcal{L}$;

3) $A(x)$ is continuous on $K$.

Proof. The necessity of the theorem’s conditions is established in exactly the same
manner as in Theorem 1. We now prove their sufficiency. Let $\mathcal{L}_n$ be a monotone
sequence of finite-dimensional subspaces such that $\bigcup_n \mathcal{L}_n$ is dense in $\mathcal{X}$ and let
$P_n$ be the projection operator on $\mathcal{L}_n$. Since for all $x$ for which $A(x)$ is defined
$P_n A(x) \to A(x)$ as $n \to \infty$ it is sufficient to show that the operator $P_n A(x)$ for each $n$
is a limit of a sequence of operators $A_m^{(n)} x$ convergent in measure $\mu$. However, in
that case $P_n A_m^{(n)} x$ also converges in measure $\mu$ to $P_n A(x)$. In order that the latter
be fulfilled it is sufficient that $(A_m^{(n)} x, e_k)$ converge in measure $\mu$ to $(P_n A(x), e_k)$,
where $\{e_k\}$ is a basis such that its intercepts are bases in $\mathcal{L}_n$. Since $(P_n A(x), e_k) =
(A(x), P_n e_k)$ is a $\mathfrak{B}$-measurable linear functional, in view of Theorem 1 one can find a
sequence of vectors $a_k^{(m)}$ such that

$$(x, a_k^{(m)}) \to (A(x), e_k)$$

as $m \to \infty$ in measure $\mu$. However, then

$$\sum_{k=1}^{\infty} (x, a_k^{(m)}) P_n e_k \to P_n A(x)$$

also in measure $\mu$ (only a finite number of summands are nonzero in the sum on
the left of the last relationship). Thus the required sequence of operators $A_m^{(n)}$ is
defined by the equation

$$(A_m^{(n)} x, e_k) = \begin{cases} (x, a_k^{(m)}), & e_k \in \mathcal{L}_n, \\ 0, & e_k \notin \mathcal{L}_n, \end{cases}$$

and the sufficiency of the conditions of Theorem 2 is verified. □